Reference Architecture: NAT Gateway for DOKS and Droplet #27

@do-joe

Goal
Create a new reference architecture in scale-with-simplicity that provisions:

  • A VPC
  • A VPC NAT Gateway
  • A DOKS cluster with a node pool (default 1+ nodes) and the Routing Agent enabled
  • A Route custom resource on the cluster that overrides the default route (0.0.0.0/0) to the NAT Gateway’s Routing table IP address (a.k.a. gateway IP) using the DOKS Routing Agent
  • A Droplet configured (via cloud-init) to set its default route to the NAT Gateway’s gateway IP and preserve access to the metadata endpoint

This should demonstrate a simple, reproducible pattern for making all egress traffic originate from a single static IP via NAT, for both cluster workloads and a non‑Kubernetes host.


Scope & constraints

  • CI/CD is out of scope (manual terraform apply is fine).

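  • You cannot reliably use the Kubernetes provider to apply a manifest in the same Terraform module that creates the cluster: a provider's configuration must be known before its resources are planned, and kubernetes_manifest needs access to the cluster API at plan time. Use two modules: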

    1. infra module: VPC, NAT Gateway, DOKS (with routing agent), Droplet
    2. routes module: applies the Route CRD manifest to the created cluster
  • Alternatively, you may include an example YAML manifest and clear README instructions to apply it manually with kubectl. Either approach is acceptable.


Deliverables

  • A new folder under reference-architectures/ (suggested): reference-architectures/nat-gateway/

    • terraform/1-infra/ — Terraform for VPC, NAT Gateway, DOKS (+ routing agent), Droplet
    • terraform/2-routes/ — Terraform that uses the Kubernetes provider (or just example YAML) to apply a Route object
    • README.md — overview, variables/outputs, apply/verify/cleanup steps
  • Working Terraform code referencing the specific resources listed below (see References).

  • Cloud-init configuration for the Droplet to:

    • add a route to 169.254.169.254/32 via the original gateway (preserves metadata)
    • replace default route to the NAT Gateway’s gateway IP (private VPC address shown on the NAT Gateway details)
  • Route CRD that overrides default route for cluster nodes to the NAT Gateway’s gateway IP.


Suggested structure

reference-architectures/
  nat-gateway/
    terraform/
      1-infra/
        main.tf
        variables.tf
        outputs.tf
      2-routes/
        main.tf     # uses kubernetes_manifest OR ships example YAML + README steps
        variables.tf
        outputs.tf
    README.md

Implementation notes

NOTE: The examples below have not been validated. They should be syntactically correct, but it's possible they are not. Refer to the provider and product documentation if you get errors when applying them.

1) Terraform: required resources

Use these providers/resources explicitly:

Tip: in the routes module, configure the Kubernetes provider using a data source (e.g., data.digitalocean_kubernetes_cluster) to pull the cluster endpoint/credentials.
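The tip above could look roughly like the following in terraform/2-routes/main.tf. This is an unvalidated sketch: it assumes the cluster is looked up by name through var.cluster_name and that the cluster's kube_config attributes are used for authentication; the data source label "this" is arbitrary.

```hcl
# Look up the cluster created by the infra module (assumes var.cluster_name
# matches the name used in 1-infra).
data "digitalocean_kubernetes_cluster" "this" {
  name = var.cluster_name
}

# Configure the Kubernetes provider from the data source so this module
# never depends on a resource it creates itself.
provider "kubernetes" {
  host  = data.digitalocean_kubernetes_cluster.this.endpoint
  token = data.digitalocean_kubernetes_cluster.this.kube_config[0].token
  cluster_ca_certificate = base64decode(
    data.digitalocean_kubernetes_cluster.this.kube_config[0].cluster_ca_certificate
  )
}
```

Because the data source is read at plan time, the infra module has to be applied before this module is planned.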

2) DOKS cluster: enable the Routing Agent

In the digitalocean_kubernetes_cluster resource, include a routing_agent block:

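resource "digitalocean_kubernetes_cluster" "this" {
  name    = var.cluster_name
  region  = var.region
  version = var.k8s_version

  # Place the cluster in the reference VPC so nodes can reach the NAT Gateway.
  # Assumes a digitalocean_vpc.this resource is defined in the same module.
  vpc_uuid = digitalocean_vpc.this.id

  # Enable the DOKS Routing Agent, which applies Route custom resources to nodes.
  routing_agent {
    enabled = true
  }

  node_pool {
    name       = "np-default"
    size       = var.node_size
    node_count = var.node_count
  }
}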

3) Route CRD: override default route to NAT Gateway

Important: The Route spec must point to the NAT Gateway’s Routing table IP address (its gateway IP on the VPC), not its public IP. Provide this value to the module, e.g., as an output derived from the digitalocean_vpc_nat_gateway resource, or via a README instruction to read it from the NAT Gateway details page.

Example manifest to ship/apply (DOKS Routing Agent):

apiVersion: networking.doks.digitalocean.com/v1alpha1
kind: Route
metadata:
  name: default-egress-via-nat
spec:
  destinations:
    - "0.0.0.0/0"
  gateways:
    - "${nat_gateway_gateway_ip}" # VPC gateway IP shown on the NAT Gateway details page

If using Terraform kubernetes_manifest, template the gateway IP in via a variable.
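A minimal, unvalidated sketch of the kubernetes_manifest approach follows. It mirrors the YAML manifest above and takes the gateway IP from the nat_gateway_gateway_ip variable. The resource label is arbitrary, and the sketch assumes the Route object is cluster-scoped (the YAML example carries no namespace); if it turns out to be namespaced, a namespace would need to be added under metadata.

```hcl
variable "nat_gateway_gateway_ip" {
  type        = string
  description = "Routing table IP (VPC gateway IP) of the NAT Gateway."
}

# Applies the same Route object as the YAML example, with the gateway IP
# supplied through a variable instead of a hand-edited manifest.
resource "kubernetes_manifest" "default_egress_via_nat" {
  manifest = {
    apiVersion = "networking.doks.digitalocean.com/v1alpha1"
    kind       = "Route"
    metadata = {
      name = "default-egress-via-nat"
    }
    spec = {
      destinations = ["0.0.0.0/0"]
      gateways     = [var.nat_gateway_gateway_ip]
    }
  }
}
```

Note that kubernetes_manifest contacts the cluster API during planning, which is one more reason to keep this in a separate module from the cluster itself.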

4) Droplet: cloud-init to persist routing

Provide a minimal Ubuntu-compatible cloud-init that preserves metadata access and sets the default route to the NAT Gateway gateway IP. It’s fine to use runcmd to apply immediately and write a Netplan file for persistence.

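#cloud-config
package_update: false
package_upgrade: false

write_files:
  # Persists the routes across reboots. Netplan warns when its config files
  # are readable by other users, so the file is restricted to root.
  - path: /etc/netplan/99-natgw.yaml
    permissions: "0600"
    content: |
      network:
        version: 2
        ethernets:
          eth0:
            routes:
              # Keep the metadata endpoint reachable via the public gateway.
              - to: 169.254.169.254/32
                via: ${original_gateway}
          eth1:
            routes:
              # Send all other egress through the NAT Gateway's VPC gateway IP.
              - to: 0.0.0.0/0
                via: ${nat_gateway_gateway_ip}

runcmd:
  # Apply the routes immediately. The public gateway is read from the metadata
  # service at boot; runcmd entries share one shell script, so the variable persists.
  - original_gw=$(curl -s http://169.254.169.254/metadata/v1/interfaces/public/0/ipv4/gateway)
  - ip route add 169.254.169.254 via $original_gw dev eth0 || true
  - ip route replace default via ${nat_gateway_gateway_ip}
  - netplan apply || true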

You may compute ${original_gateway} in Terraform via a template variable or leave instructions in the README to replace it manually for the demo.
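One way to render the cloud-init above from Terraform is templatefile. The sketch below is unvalidated and makes several assumptions: the template is saved as cloud-init.yaml.tftpl next to main.tf (a hypothetical file name), the VPC resource is named digitalocean_vpc.this, the Droplet name is a placeholder, and original_gateway is supplied through a hypothetical var.original_gateway. The public gateway is not known to Terraform before the Droplet exists, so that variable would have to be filled in manually for the demo.

```hcl
variable "original_gateway" {
  type        = string
  description = "Hypothetical: the Droplet's public-interface gateway, used for the metadata route."
}

resource "digitalocean_droplet" "this" {
  name     = "natgw-demo" # placeholder name
  region   = var.region
  size     = var.droplet_size
  image    = var.droplet_image
  vpc_uuid = digitalocean_vpc.this.id

  # Render the cloud-init template; only the two placeholders written with
  # braces are substituted. Shell expressions such as $(curl ...) and
  # $original_gw carry no braces and pass through the template unchanged.
  user_data = templatefile("${path.module}/cloud-init.yaml.tftpl", {
    nat_gateway_gateway_ip = var.nat_gateway_gateway_ip
    original_gateway       = var.original_gateway
  })
}
```

Changing user_data on an existing Droplet typically forces it to be recreated, so it helps to settle the template before the first apply.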


Variables (suggested)

  • region (default: sfo3)
  • vpc_name
  • nat_gateway_name
  • cluster_name
  • k8s_version (default: latest stable x.y-do.z)
  • node_size (default: s-2vcpu-4gb)
  • node_count (default: 1)
  • droplet_size (default: s-1vcpu-1gb)
  • droplet_image (default: ubuntu-22-04-x64)
  • droplet_user_data (string/template for cloud-init)
  • nat_gateway_gateway_ip (string; the Routing table IP to use in Route and Droplet)
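As a starting point, the list above might translate into a variables.tf along these lines. This is a sketch only: the types are assumptions, k8s_version is left without a default because the issue does not pin a version, and droplet_user_data defaults to an empty string as a placeholder.

```hcl
variable "region" {
  type    = string
  default = "sfo3"
}

variable "vpc_name" {
  type = string
}

variable "nat_gateway_name" {
  type = string
}

variable "cluster_name" {
  type = string
}

# No default here: supply a current DOKS version slug (x.y-do.z).
variable "k8s_version" {
  type = string
}

variable "node_size" {
  type    = string
  default = "s-2vcpu-4gb"
}

variable "node_count" {
  type    = number
  default = 1
}

variable "droplet_size" {
  type    = string
  default = "s-1vcpu-1gb"
}

variable "droplet_image" {
  type    = string
  default = "ubuntu-22-04-x64"
}

variable "droplet_user_data" {
  type        = string
  description = "Cloud-init document (or rendered template) for the Droplet."
  default     = ""
}

variable "nat_gateway_gateway_ip" {
  type        = string
  description = "Routing table IP of the NAT Gateway, used by the Route and the Droplet."
}
```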

Expose outputs for: VPC ID, NAT Gateway ID & gateway IP, cluster ID/name, kubeconfig (if you choose to output it), Droplet private IP.
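A possible outputs.tf for the infra module is sketched below. The resource labels (digitalocean_vpc.this, digitalocean_vpc_nat_gateway.this, digitalocean_kubernetes_cluster.this, digitalocean_droplet.this) are assumptions about how main.tf names things. The NAT Gateway's gateway IP output is deliberately omitted because the exact attribute path should be confirmed against the provider documentation.

```hcl
output "vpc_id" {
  value = digitalocean_vpc.this.id
}

output "nat_gateway_id" {
  value = digitalocean_vpc_nat_gateway.this.id
}

output "cluster_id" {
  value = digitalocean_kubernetes_cluster.this.id
}

output "cluster_name" {
  value = digitalocean_kubernetes_cluster.this.name
}

# Optional: marked sensitive so the credentials are not printed in plan output.
output "kubeconfig" {
  value     = digitalocean_kubernetes_cluster.this.kube_config[0].raw_config
  sensitive = true
}

output "droplet_private_ip" {
  value = digitalocean_droplet.this.ipv4_address_private
}
```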


README: required content

  • What this RA does and an architecture diagram (optional ASCII is fine)

  • Prereqs: Terraform, doctl, export DIGITALOCEAN_ACCESS_TOKEN

  • How to apply:

    1. terraform init && terraform apply in terraform/1-infra/ to create infra
    2. Obtain NAT Gateway Routing table IP address and pass it to the routes module (or edit YAML)
    3. Apply routes module or kubectl apply -f route.yaml
  • How to verify:

    • kubectl run -it --rm test --image=busybox --restart=Never -- sh -c "wget -qO- ifconfig.me"
    • Confirm the returned IP equals the NAT Gateway public IP
    • SSH to Droplet and run curl ifconfig.me
  • How to destroy (and order of operations if the Route needs removal first)

  • Notes: differences between gateway IP (VPC) vs NAT public IP; metadata route preservation on Droplets; mention ECMP if multiple gateways are added later.


References


Definition of Done

  • The example runs end‑to‑end without manual edits beyond supplying the NAT Gateway Routing table IP where required
  • Documentation is clear enough for a newcomer to replicate and verify egress IP from both cluster pods and the Droplet
  • Code is formatted and validated (terraform fmt / terraform validate)
