Metworx Client's Network Requirements


The Metworx GUI runs in an AWS account and VPC owned and managed by Metrum Research Group, and exposes a URL that clients use for Metworx management and administration.

Metworx HPC clusters are launched in the client's AWS environment, usually within private subnets in a client-managed VPC.

Below is a high-level diagram of the setup options for the Metworx GUI:

Client VPC Requirements

The following should be used as a checklist for VPC setup in a client environment.

Requirement | Description | Notes
DNS Resolution | Cluster nodes need to be able to resolve their own hostnames when they are launched in a client's VPC. DNS has to propagate within seconds after EC2 instances are launched. DNS hostnames and DNS resolution have to be enabled in the VPC. | If a custom domain name is used or expected, DHCP options can be used to set the domain name on the nodes. DHCP options should point to the DNS servers that the clusters should use (AWS, client-owned, etc.).
Time | Servers need to be able to get the time via NTP from a reliable source. | In most cases, the default AWS NTP setup (set via the DHCP options for the VPC) is sufficient.
External Connectivity and Routes | Cluster nodes need external routes and connectivity to some resources outside of the VPC. Those resources are trusted configuration repositories reachable over HTTPS/SSL. See the External (Outbound) Connectivity Needed section below. | The VPC must have a route to the internet, typically via a NAT gateway. Outbound (egress) security group rules must allow HTTPS (port 443) traffic.
Subnet Size | VPC subnets should be sized large enough to allow all clusters and cluster nodes to run simultaneously. | Account limits for the number of EC2 instances should be checked to make sure enough instances of the desired size can be launched. If clusters are launched in public subnets, the EIP limit needs to be raised so that it is at least as large as the EC2 limit.
VPC Endpoints | We recommend creating private VPC endpoints for access to AWS services such as S3, CloudFormation, AutoScaling, CloudTrail, CloudWatch, and DynamoDB, especially if outbound connectivity is limited or slow. | For more information, refer to https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints.html
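
The Subnet Size requirement can be checked with simple arithmetic. The following is a minimal sketch, not a Metworx tool: it assumes the only consumers of addresses in the subnet are cluster nodes, and it relies on the fact that AWS reserves five addresses (the first four and the last) in every subnet. The function names, the example CIDR, and the cluster counts are hypothetical.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr: str) -> int:
    """Return the number of addresses AWS leaves available in a subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def fits(cidr: str, clusters: int, nodes_per_cluster: int) -> bool:
    """Check whether all nodes of all clusters can run at the same time.

    Assumption: each node needs exactly one private IP and nothing else
    (VPC endpoints, load balancers, etc.) consumes addresses in the subnet.
    """
    return clusters * nodes_per_cluster <= usable_addresses(cidr)


if __name__ == "__main__":
    # A /24 has 256 addresses, of which 251 remain after AWS reservations.
    print(usable_addresses("10.0.1.0/24"))
    print(fits("10.0.1.0/24", clusters=4, nodes_per_cluster=50))
```

In practice it is worth leaving headroom beyond the computed minimum, since interface VPC endpoints and other resources placed in the same subnet also take addresses.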

External (Outbound) Connectivity Needed

Metworx HPC clusters need to be able to reach the following resources outside of the AWS account. It is strongly recommended that the AWS default outbound rules are NOT removed. If least-privilege external connectivity is required, the following is the bare minimum set of DNS-based outbound rules.

Resource | Location
Metworx CloudFormation Template and Custom Cookbooks | S3 bucket via HTTPS/443 (e.g. https://s3-us-west-2.amazonaws.com/metworx-cookbooks/*)
Ubuntu Official Patch Repository | Ubuntu AWS repositories in each region (HTTPS/443) (e.g. http://*.ubuntu.com/)
CRAN Repository | https://cran.r-project.org
GitHub Repos (HTTPS) | github.com (used for pulling from metrumresearchgroup releases)
MPN | https://mpn.metworx.com

Note: If outbound connectivity to AWS service/API endpoints is not opened, VPC endpoints for common services must be created (S3, AutoScaling, CloudFormation, CloudWatch, and CloudTrail at a minimum).
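
One way to confirm that the outbound rules are in place is to attempt a TCP connection on port 443 to each listed host from a node inside the subnet. The sketch below is an illustration under assumptions, not part of Metworx: the `check_endpoints` helper and the `ENDPOINTS` list are hypothetical names, the hosts are taken from the resource list above, and the wildcard Ubuntu entry is omitted because a concrete mirror hostname would have to be supplied for the region in use. A successful TCP connect only shows that routing and security groups allow the traffic; it does not validate TLS or proxy behavior.

```python
import socket
from typing import Callable, Dict, Iterable, Tuple

Endpoint = Tuple[str, int]

# Hosts from the outbound resource list; *.ubuntu.com is left out because
# a wildcard cannot be probed without choosing a concrete hostname.
ENDPOINTS = [
    ("s3-us-west-2.amazonaws.com", 443),
    ("cran.r-project.org", 443),
    ("github.com", 443),
    ("mpn.metworx.com", 443),
]


def check_endpoints(
    endpoints: Iterable[Endpoint],
    timeout: float = 5.0,
    connect: Callable = socket.create_connection,
) -> Dict[Endpoint, bool]:
    """Try a TCP connection to each (host, port) and report reachability.

    The `connect` parameter exists so the function can be exercised with a
    stand-in connector when no network is available.
    """
    results: Dict[Endpoint, bool] = {}
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[(host, port)] = True
        except OSError:
            # Covers DNS failures, refused connections, and timeouts.
            results[(host, port)] = False
    return results
```

Calling `check_endpoints(ENDPOINTS)` on a cluster node returns a dictionary mapping each endpoint to `True` or `False`; any `False` entry points at a missing route, a blocked egress rule, or a DNS resolution problem.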

Inbound Connectivity

The following connectivity is needed to the subnet where Metworx clusters are run. Only the master node needs to be reachable from outside the cluster; network access to compute nodes from outside of the cluster is blocked by the compute node security groups.

Port | Description | App
22 | SSH to master node | ssh
443 | HTTPS | R-Studio/RS-connect/Guacamole desktop
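
For clients who manage security groups programmatically, the two inbound ports can be expressed as ingress rules. The sketch below builds them in the `IpPermissions` shape accepted by the EC2 AuthorizeSecurityGroupIngress API (for example via boto3's `authorize_security_group_ingress`). It is an assumption-laden illustration: the `master_ingress_rules` helper is hypothetical, the source CIDR must be replaced with the client's own network range, and the source does not state how Metworx itself provisions these rules.

```python
import ipaddress
from typing import Dict, List

# Ports from the inbound connectivity list for the master node.
INBOUND_PORTS = {
    22: "SSH to master node",
    443: "HTTPS (R-Studio/RS-connect/Guacamole desktop)",
}


def master_ingress_rules(source_cidr: str) -> List[Dict]:
    """Build TCP ingress rules for the master node security group.

    `source_cidr` is the network allowed to reach the master node; it is
    validated here so that a malformed range fails early with ValueError.
    """
    ipaddress.ip_network(source_cidr)
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": source_cidr, "Description": description}],
        }
        for port, description in sorted(INBOUND_PORTS.items())
    ]
```

The resulting list could be passed as the `IpPermissions` argument together with the master node's security group ID; compute node security groups would receive no such rules, consistent with compute nodes being unreachable from outside the cluster.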