diff --git a/calico-cloud/networking/egress/egress-gateway-host-ip.mdx b/calico-cloud/networking/egress/egress-gateway-host-ip.mdx new file mode 100644 index 0000000000..7be8194933 --- /dev/null +++ b/calico-cloud/networking/egress/egress-gateway-host-ip.mdx @@ -0,0 +1,647 @@ +--- +description: Configure egress gateways to use the host node IP as the source address for traffic leaving the cluster. +--- + +# Configure egress gateways with Host IP support + +## Big picture + +Configure specific application traffic to exit the cluster through an egress gateway, using the +gateway's **host (node) IP** as the source address for traffic leaving the cluster. + +## Value + +When traffic from particular applications leaves the cluster to access an external destination, it +can be useful to control the source IP of that traffic. For example, there may be an additional +firewall around the cluster, whose purpose includes policing external accesses from the cluster, and +specifically that particular external destinations can only be accessed from authorized workloads +within the cluster. + +In this mode, outbound traffic passing through an egress gateway is source-NATed (SNAT) to the +**node IP of the host** where the egress gateway pod is running, rather than the gateway's own pod +IP. This is useful when external firewalls or services need to allowlist traffic based on a +stable set of known node IPs, or when pod IPs are not routable outside the cluster. + +By scheduling egress gateways to specific nodes and setting `natOutgoing: true` on the egress IP +pool, you ensure that all application traffic routed through those gateways exits the cluster with +the node IP of the gateway's host as the source address. Any number of application pods can have +their outbound connections multiplexed through a fixed small number of egress gateways, and all of +those outbound connections will appear to come from the gateway nodes' IPs. 
+ +:::note + +The source port of an outbound flow through an egress gateway can generally _not_ be +preserved. Changing the source port is how Linux maps flows from many upstream IPs onto a single +downstream IP. + +::: + +Egress gateways with host IP support are particularly useful when you want all outbound traffic from +a particular application to leave the cluster through a particular node or nodes, and to appear as +traffic originating from those nodes' IPs. The gateways are scheduled to the desired nodes, and the +application pods/namespaces are configured to use those gateways. + +## Concepts + +### Egress gateway + +An egress gateway acts as a transit pod for the outbound application traffic that is configured to +use it. As traffic leaving the cluster passes through the egress gateway, its source IP is changed +before the traffic is forwarded on. + +### Source IP with host IP mode + +When an outbound application flow leaves the cluster through an egress gateway, the source IP +depends on whether `natOutgoing` is enabled on the egress gateway's [IP pool](../../reference/resources/ippool.mdx). + +- If the egress gateway's IP pool has `natOutgoing: true`, the flow's source IP is the **node (host) + IP** of the node where the egress gateway pod is running. This is the **host IP mode** described in + this guide. +- If `natOutgoing: false` (or unset), the flow's source IP is the egress gateway's **pod IP**. + +In host IP mode, external services and firewalls see connections arriving from the egress gateway's +node IP. This is useful when node IPs are stable and well-known, making them suitable for firewall +allowlisting. + +### Control the use of egress gateways + +If a cluster ascribes special meaning to traffic flowing through egress gateways, it will be +important to control when cluster users can configure their pods and namespaces to use them, so that +non-special pods cannot impersonate the special meaning. 
+ +If namespaces in a cluster can only be provisioned by cluster admins, one option is to enable egress +gateway function only on a per-namespace basis. Then only cluster admins will be able to configure +any egress gateway usage. + +Otherwise -- if namespace provisioning is open to users in general, or if it's desirable for egress +gateway function to be enabled both per-namespace and per-pod -- a [Kubernetes admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) + will be +needed. This is a task for each deployment to implement for itself, but possible approaches include +the following. + +1. Decide whether a given Namespace or Pod is permitted to use egress annotations at all, based on + other details of the Namespace or Pod definition. + +1. Evaluate egress annotation selectors to determine the egress gateways that they map to, and + decide whether that usage is acceptable. + +1. Impose the cluster's own bespoke scheme for a Namespace or Pod to identify the egress gateways + that it wants to use, less general than $[prodname]'s egress annotations. Then the + admission controller would police those bespoke annotations (that that cluster's users could + place on Namespace or Pod resources) and either reject the operation in hand, or allow it + through after adding the corresponding $[prodname] egress annotations. + +### Policy enforcement for flows via an egress gateway + +For an outbound connection from a client pod, via an egress gateway, to a destination outside the +cluster, there is more than one possible enforcement point for policy. + +The path of the traffic through policy is as follows: + +1. The packet leaves the client pod and passes through its egress policy. +2. The packet is encapsulated by the client pod's host and sent to the egress gateway. +3. The encapsulated packet is sent from the host to the egress gateway pod. +4. 
The egress gateway pod de-encapsulates the packet and sends the packet out again with its own address. +5. The packet leaves the egress gateway pod through its egress policy. + +To ensure correct operation (as of v3.15), the encapsulated traffic between host and egress gateway is auto-allowed by +$[prodname] and other ingress traffic is blocked. That means that there are effectively two places where +policy can be applied: + +1. on egress from the client pod +2. on egress from the egress gateway pod (see limitations below). + +The policy applied at (1) is the most powerful since it implicitly sees the original source of the traffic (by +virtue of being attached to that original source). It also sees the external destination of the traffic. + +Since an egress gateway will never originate its own traffic, one option is to rely on policy applied at (1) and +to allow all traffic at (2) (either by applying no policy or by applying an "allow all"). + +Alternatively, for maximum "defense in depth", applying policy at both (1) and (2) provides extra protection should +the policy at (1) be disabled or bypassed by an attacker. Policy at (2) has the following limitations: + +- [Domain-based policy](../../network-policy/domain-based-policy.mdx) is not supported at egress from egress + gateways. It will either fail to match the expected traffic, or it will work intermittently if the egress gateway + happens to be scheduled to the same node as its clients. This is because any DNS lookup happens at the client pod. + By the time the traffic reaches (2), the DNS information is lost and only the IP addresses of the traffic are available. + +- The traffic source will appear to be the egress gateway pod; the original source information is lost in the address + translation that occurs inside the egress gateway pod. 
+ +That means that policies at (2) will usually take the form of rules that match only on destination port and IP address, +either directly in the rule (via a CIDR match) or via a (non-domain-based) NetworkSet. Matching on source has little +utility since the source IP is always that of the egress gateway and the port of translated traffic is not always preserved. + +:::note + +Since v3.15.0, $[prodname] also sends health probes to the egress gateway pods from the nodes where +their clients are located. In iptables mode, this traffic is auto-allowed at egress from the host and ingress +to the egress gateway. In eBPF mode, the probe traffic can be blocked by policy, so you must ensure that this traffic is allowed; this should be fixed in an upcoming +patch release. + +::: + +## Before you begin + +**Required** + +- Calico CNI +- Open port UDP 4790 on the host + +**Not Supported** + +- GKE +- Windows + +## How to + +- [Enable egress gateway support](#enable-egress-gateway-support) +- [Provision an egress IP pool](#provision-an-egress-ip-pool) +- [Deploy a group of egress gateways](#deploy-a-group-of-egress-gateways) +- [Configure iptables backend for egress gateways](#configure-iptables-backend-for-egress-gateways) +- [Affine a client pod to a specific node](#affine-a-client-pod-to-a-specific-node) +- [Configure namespaces and pods to use egress gateways](#configure-namespaces-and-pods-to-use-egress-gateways) +- [Optionally enable ECMP load balancing](#optionally-enable-ecmp-load-balancing) +- [Verify the feature operation](#verify-the-feature-operation) +- [Control the use of egress gateways](#control-the-use-of-egress-gateways) +- [Upgrade egress gateways](#upgrade-egress-gateways) + +### Enable egress gateway support + +In the default **FelixConfiguration**, set the `egressIPSupport` field to `EnabledPerNamespace` or +`EnabledPerNamespaceOrPerPod`, according to the level of support that you need in your cluster. 
For +support on a per-namespace basis only: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespace"}}' +``` + +Or for support both per-namespace and per-pod: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespaceOrPerPod"}}' +``` + +:::note + +- `egressIPSupport` must be the same on all cluster nodes, so you should set them only in the + `default` FelixConfiguration resource. +- The operator automatically enables the required policy sync API in the FelixConfiguration. + +::: + +### Provision an egress IP pool + +Provision a small IP Pool with `natOutgoing: true`. This ensures that traffic exiting through egress +gateways using this pool is source-NATed to the host IP of the node running the gateway pod. + +```bash +kubectl apply -f - < + terminationGracePeriodSeconds: 0 +EOF +``` + +Replace `` with the hostname of the node where you want the egress gateway to +run. Traffic passing through this gateway will exit the cluster with this node's IP as the source +address. + +:::note + +When deploying egress gateway in a non-default namespace on OpenShift, the namespace needs to be set privileged by adding the following to the namespace: + +##### Label +``` +openshift.io/run-level: "0" +pod-security.kubernetes.io/enforce: privileged +pod-security.kubernetes.io/enforce-version: latest +``` +##### Annotation +``` +security.openshift.io/scc.podSecurityLabelSync: "false" +``` +::: + +Where: + +- It is advisable to have more than one egress gateway per group, so that the egress IP function continues if one of the gateways crashes or needs to be restarted. When there are multiple gateways in a group, outbound traffic from the applications using that group is load-balanced across the available gateways. The number of `replicas` specified must be less than or equal to the number of free IP addresses in the IP Pool. 
+ +- IPPool can be specified either by its name (e.g. `-name: egress-ippool-1`) or by its CIDR (e.g. `-cidr: 10.10.10.0/31`). + +- The labels are arbitrary. You can choose whatever names and values are convenient for your cluster's Namespaces and Pods to refer to in their egress selectors. + + If labels are not specified, a default label `projectcalico.org/egw`:`name` will be added by the Tigera Operator. + +- icmpProbe may be used to specify the Probe IPs, ICMP interval and timeout in seconds. `ips` if set, the + egress gateway pod will probe each IP periodically using an ICMP ping. If all pings fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. + `timeoutSeconds` controls the timeout before reporting non-ready if no probes succeed. + + ```yaml + icmpProbe: + ips: + - probeIP + - probeIP + timeoutSeconds: 20 + intervalSeconds: 10 + ``` + +- httpProbe may be used to specify the Probe URLs, HTTP interval and timeout in seconds. `urls` if set, the + egress gateway pod will probe each external service periodically. If all probes fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. + `timeoutSeconds` controls the timeout before reporting non-ready if all probes are failing. + + ```yaml + httpProbe: + urls: + - probeURL + - probeURL + timeoutSeconds: 30 + intervalSeconds: 10 + ``` +- Please refer to the [operator reference docs](../../reference/installation/api.mdx) for details about the egress gateway resource type. + +The health port `8080` is used by: + +- The Kubernetes `readinessProbe` to expose the status of the egress gateway pod (and any ICMP/HTTP + probes). +- Remote pods to check if the egress gateway is "ready". Only "ready" egress + gateways will be used for remote client traffic. This traffic is automatically allowed by $[prodname] and + no policy is required to allow it. 
$[prodname] only sends probes to egress gateway pods that have a named + "health" port. This ensures that during an upgrade, health probes are only sent to upgraded egress gateways. + +### Deploy on a RKE2 CIS-hardened cluster + +If you are deploying `egress-gateway` on a RKE2 CIS-hardened cluster, its `PodSecurityPolicies` restrict the `securityContext` and `volumes` required by egress gateway. When deploying using the egress gateway custom resource, the Tigera Operator sets up `PodSecurityPolicy`, `Role`, `RoleBinding` and associated `ServiceAccount`. + +### Configure iptables backend for egress gateways + +The Tigera Operator configures egress gateways to use the same iptables backend as `calico-node`. +To modify the iptables backend for egress gateways, you must change the `iptablesBackend` field in the [Felix configuration](../../reference/resources/felixconfig.mdx). + +### Configure IP autodetection for dual-ToR clusters. + +If you plan to use Egress Gateways in a [dual-ToR cluster](../configuring/dual-tor.mdx), you must also adjust the $[nodecontainer] IP +auto-detection method to pick up the stable IP, for example using the `interface: lo` setting +(The default first-found setting skips over the lo interface). This can be configured via the +$[prodname] [Installation resource](../../reference/installation/api.mdx#nodeaddressautodetection). + +### Affine a client pod to a specific node + +In host IP mode, you may want to control which node your client pods run on to ensure deterministic +routing through a specific egress gateway. Use `nodeSelector` to schedule a client pod to a specific +node: + +```bash +kubectl apply -f - < + containers: + - name: alpine + image: alpine + command: ["/bin/sleep"] + args: ["infinity"] +EOF +``` + +Replace `` with the hostname of the desired node. 
When combined with an egress gateway +policy that uses `gatewayPreference: PreferNodeLocal`, the client pod will prefer to route traffic +through an egress gateway on the same node, ensuring the traffic exits with that node's IP. + +### Configure namespaces and pods to use egress gateways + +You can configure namespaces and pods to use an egress gateway by: +* annotating the namespace or pod +* applying an egress gateway policy to the namespace or pod. + +Using an egress gateway policy is more complicated, but it allows advanced use cases. + +#### Configure a namespace or pod to use an egress gateway (annotation method) + +In a $[prodname] deployment, the Kubernetes namespace and pod resources honor annotations that +tell that namespace or pod to use particular egress gateways. These annotations are selectors, and +their meaning is "the set of pods, anywhere in the cluster, that match those selectors". + +So, to configure all the pods in a namespace to use the egress gateways that are +labelled with `egress-code: red`, you would annotate that namespace like this: + +```bash +kubectl annotate ns egress.projectcalico.org/selector="egress-code == 'red'" +``` + +By default, that selector can only match egress gateways in the same namespace. To select gateways +in a different namespace, specify a `namespaceSelector` annotation as well, like this: + +```bash +kubectl annotate ns egress.projectcalico.org/namespaceSelector="projectcalico.org/name == 'default'" +``` + +Egress gateway annotations have the same [syntax and range of expressions](../../reference/resources/networkpolicy.mdx#selector) as the selector fields in +$[prodname] [network policy](../../reference/resources/networkpolicy.mdx#entityrule). + +To configure a specific Kubernetes Pod to use egress gateways, specify the same annotations when +creating the pod. 
For example: + +```bash +kubectl apply -f - < egress.projectcalico.org/egressGatewayPolicy="egw-policy1" +``` + +To configure a specific Kubernetes pod to use the same policy, specify the same annotations when +creating the pod. +For example: + +```bash +kubectl apply -f - < -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' +``` + +By way of a concrete example, you could use netcat to run a test server outside the cluster; for +example: + +```bash +docker run --net=host --privileged subfuzion/netcat -v -l -k -p 8089 +``` + +Then provision an egress IP Pool (with `natOutgoing: true`), and egress gateways, as above. + +Then deploy a pod, with egress annotations as above, and with any image that includes netcat, for example: + +```bash +kubectl apply -f - < -n -- nc 8089 ` should be the IP address of the netcat server. + +Then, if you check the logs or output of the netcat server, you should see: + +``` +Connection from received +``` + +with `` being the **node IP** of the host where the egress gateway pod is running (not +the gateway's pod IP or the egress IP pool IP). + +## Upgrade egress gateways + +From v3.16, egress gateway deployments are managed by the Tigera Operator. + +- When upgrading from a pre-v3.16 release, no automatic upgrade will occur. To upgrade a pre-v3.16 egress gateway deployment, + create an equivalent EgressGateway resource with the same namespace and the same name as mentioned [above](#deploy-a-group-of-egress-gateways); + the operator will then take over management of the old Deployment resource, replacing it with the upgraded version. + +- Use `kubectl apply` to create the egress gateway resource. Tigera Operator will read the newly created resource and wait + for the other $[prodname] components to be upgraded. Once the other $[prodname] components are upgraded, Tigera Operator + will upgrade the existing egress gateway deployment with the new image. 
+ +By default, upgrading egress gateways will sever any connections that are flowing through them. To minimise impact, +the egress gateway feature supports some advanced options that give feedback to affected pods. For more details see +the [egress gateway maintenance guide](egress-gateway-maintenance.mdx). + +## Additional resources + +Please see also: + +- The `egressIP...` fields of the [FelixConfiguration resource](../../reference/resources/felixconfig.mdx#spec). +- [Additional configuration for egress gateway maintenance](egress-gateway-maintenance.mdx) diff --git a/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx b/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx new file mode 100644 index 0000000000..7be8194933 --- /dev/null +++ b/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx @@ -0,0 +1,647 @@ +--- +description: Configure egress gateways to use the host node IP as the source address for traffic leaving the cluster. +--- + +# Configure egress gateways with Host IP support + +## Big picture + +Configure specific application traffic to exit the cluster through an egress gateway, using the +gateway's **host (node) IP** as the source address for traffic leaving the cluster. + +## Value + +When traffic from particular applications leaves the cluster to access an external destination, it +can be useful to control the source IP of that traffic. For example, there may be an additional +firewall around the cluster, whose purpose includes policing external accesses from the cluster, and +specifically that particular external destinations can only be accessed from authorized workloads +within the cluster. + +In this mode, outbound traffic passing through an egress gateway is source-NATed (SNAT) to the +**node IP of the host** where the egress gateway pod is running, rather than the gateway's own pod +IP. 
This is useful when external firewalls or services need to allowlist traffic based on a +stable set of known node IPs, or when pod IPs are not routable outside the cluster. + +By scheduling egress gateways to specific nodes and setting `natOutgoing: true` on the egress IP +pool, you ensure that all application traffic routed through those gateways exits the cluster with +the node IP of the gateway's host as the source address. Any number of application pods can have +their outbound connections multiplexed through a fixed small number of egress gateways, and all of +those outbound connections will appear to come from the gateway nodes' IPs. + +:::note + +The source port of an outbound flow through an egress gateway can generally _not_ be +preserved. Changing the source port is how Linux maps flows from many upstream IPs onto a single +downstream IP. + +::: + +Egress gateways with host IP support are particularly useful when you want all outbound traffic from +a particular application to leave the cluster through a particular node or nodes, and to appear as +traffic originating from those nodes' IPs. The gateways are scheduled to the desired nodes, and the +application pods/namespaces are configured to use those gateways. + +## Concepts + +### Egress gateway + +An egress gateway acts as a transit pod for the outbound application traffic that is configured to +use it. As traffic leaving the cluster passes through the egress gateway, its source IP is changed +before the traffic is forwarded on. + +### Source IP with host IP mode + +When an outbound application flow leaves the cluster through an egress gateway, the source IP +depends on whether `natOutgoing` is enabled on the egress gateway's [IP pool](../../reference/resources/ippool.mdx). + +- If the egress gateway's IP pool has `natOutgoing: true`, the flow's source IP is the **node (host) + IP** of the node where the egress gateway pod is running. This is the **host IP mode** described in + this guide. 
+- If `natOutgoing: false` (or unset), the flow's source IP is the egress gateway's **pod IP**. + +In host IP mode, external services and firewalls see connections arriving from the egress gateway's +node IP. This is useful when node IPs are stable and well-known, making them suitable for firewall +allowlisting. + +### Control the use of egress gateways + +If a cluster ascribes special meaning to traffic flowing through egress gateways, it will be +important to control when cluster users can configure their pods and namespaces to use them, so that +non-special pods cannot impersonate the special meaning. + +If namespaces in a cluster can only be provisioned by cluster admins, one option is to enable egress +gateway function only on a per-namespace basis. Then only cluster admins will be able to configure +any egress gateway usage. + +Otherwise -- if namespace provisioning is open to users in general, or if it's desirable for egress +gateway function to be enabled both per-namespace and per-pod -- a [Kubernetes admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) + will be +needed. This is a task for each deployment to implement for itself, but possible approaches include +the following. + +1. Decide whether a given Namespace or Pod is permitted to use egress annotations at all, based on + other details of the Namespace or Pod definition. + +1. Evaluate egress annotation selectors to determine the egress gateways that they map to, and + decide whether that usage is acceptable. + +1. Impose the cluster's own bespoke scheme for a Namespace or Pod to identify the egress gateways + that it wants to use, less general than $[prodname]'s egress annotations. Then the + admission controller would police those bespoke annotations (that that cluster's users could + place on Namespace or Pod resources) and either reject the operation in hand, or allow it + through after adding the corresponding $[prodname] egress annotations. 
+ +### Policy enforcement for flows via an egress gateway + +For an outbound connection from a client pod, via an egress gateway, to a destination outside the +cluster, there is more than one possible enforcement point for policy. + +The path of the traffic through policy is as follows: + +1. The packet leaves the client pod and passes through its egress policy. +2. The packet is encapsulated by the client pod's host and sent to the egress gateway. +3. The encapsulated packet is sent from the host to the egress gateway pod. +4. The egress gateway pod de-encapsulates the packet and sends the packet out again with its own address. +5. The packet leaves the egress gateway pod through its egress policy. + +To ensure correct operation (as of v3.15), the encapsulated traffic between host and egress gateway is auto-allowed by +$[prodname] and other ingress traffic is blocked. That means that there are effectively two places where +policy can be applied: + +1. on egress from the client pod +2. on egress from the egress gateway pod (see limitations below). + +The policy applied at (1) is the most powerful since it implicitly sees the original source of the traffic (by +virtue of being attached to that original source). It also sees the external destination of the traffic. + +Since an egress gateway will never originate its own traffic, one option is to rely on policy applied at (1) and +to allow all traffic at (2) (either by applying no policy or by applying an "allow all"). + +Alternatively, for maximum "defense in depth", applying policy at both (1) and (2) provides extra protection should +the policy at (1) be disabled or bypassed by an attacker. Policy at (2) has the following limitations: + +- [Domain-based policy](../../network-policy/domain-based-policy.mdx) is not supported at egress from egress + gateways. It will either fail to match the expected traffic, or it will work intermittently if the egress gateway + happens to be scheduled to the same node as its clients. 
This is because any DNS lookup happens at the client pod. + By the time the traffic reaches (2), the DNS information is lost and only the IP addresses of the traffic are available. + +- The traffic source will appear to be the egress gateway pod; the original source information is lost in the address + translation that occurs inside the egress gateway pod. + +That means that policies at (2) will usually take the form of rules that match only on destination port and IP address, +either directly in the rule (via a CIDR match) or via a (non-domain-based) NetworkSet. Matching on source has little +utility since the source IP is always that of the egress gateway and the port of translated traffic is not always preserved. + +:::note + +Since v3.15.0, $[prodname] also sends health probes to the egress gateway pods from the nodes where +their clients are located. In iptables mode, this traffic is auto-allowed at egress from the host and ingress +to the egress gateway. In eBPF mode, the probe traffic can be blocked by policy, so you must ensure that this traffic is allowed; this should be fixed in an upcoming +patch release. 
+ +::: + +## Before you begin + +**Required** + +- Calico CNI +- Open port UDP 4790 on the host + +**Not Supported** + +- GKE +- Windows + +## How to + +- [Enable egress gateway support](#enable-egress-gateway-support) +- [Provision an egress IP pool](#provision-an-egress-ip-pool) +- [Deploy a group of egress gateways](#deploy-a-group-of-egress-gateways) +- [Configure iptables backend for egress gateways](#configure-iptables-backend-for-egress-gateways) +- [Affine a client pod to a specific node](#affine-a-client-pod-to-a-specific-node) +- [Configure namespaces and pods to use egress gateways](#configure-namespaces-and-pods-to-use-egress-gateways) +- [Optionally enable ECMP load balancing](#optionally-enable-ecmp-load-balancing) +- [Verify the feature operation](#verify-the-feature-operation) +- [Control the use of egress gateways](#control-the-use-of-egress-gateways) +- [Upgrade egress gateways](#upgrade-egress-gateways) + +### Enable egress gateway support + +In the default **FelixConfiguration**, set the `egressIPSupport` field to `EnabledPerNamespace` or +`EnabledPerNamespaceOrPerPod`, according to the level of support that you need in your cluster. For +support on a per-namespace basis only: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespace"}}' +``` + +Or for support both per-namespace and per-pod: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespaceOrPerPod"}}' +``` + +:::note + +- `egressIPSupport` must be the same on all cluster nodes, so you should set them only in the + `default` FelixConfiguration resource. +- The operator automatically enables the required policy sync API in the FelixConfiguration. + +::: + +### Provision an egress IP pool + +Provision a small IP Pool with `natOutgoing: true`. 
This ensures that traffic exiting through egress +gateways using this pool is source-NATed to the host IP of the node running the gateway pod. + +```bash +kubectl apply -f - < + terminationGracePeriodSeconds: 0 +EOF +``` + +Replace `` with the hostname of the node where you want the egress gateway to +run. Traffic passing through this gateway will exit the cluster with this node's IP as the source +address. + +:::note + +When deploying egress gateway in a non-default namespace on OpenShift, the namespace needs to be set privileged by adding the following to the namespace: + +##### Label +``` +openshift.io/run-level: "0" +pod-security.kubernetes.io/enforce: privileged +pod-security.kubernetes.io/enforce-version: latest +``` +##### Annotation +``` +security.openshift.io/scc.podSecurityLabelSync: "false" +``` +::: + +Where: + +- It is advisable to have more than one egress gateway per group, so that the egress IP function continues if one of the gateways crashes or needs to be restarted. When there are multiple gateways in a group, outbound traffic from the applications using that group is load-balanced across the available gateways. The number of `replicas` specified must be less than or equal to the number of free IP addresses in the IP Pool. + +- IPPool can be specified either by its name (e.g. `-name: egress-ippool-1`) or by its CIDR (e.g. `-cidr: 10.10.10.0/31`). + +- The labels are arbitrary. You can choose whatever names and values are convenient for your cluster's Namespaces and Pods to refer to in their egress selectors. + + If labels are not specified, a default label `projectcalico.org/egw`:`name` will be added by the Tigera Operator. + +- icmpProbe may be used to specify the Probe IPs, ICMP interval and timeout in seconds. `ips` if set, the + egress gateway pod will probe each IP periodically using an ICMP ping. If all pings fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. 
+ `timeoutSeconds` controls the timeout before reporting non-ready if no probes succeed. + + ```yaml + icmpProbe: + ips: + - probeIP + - probeIP + timeoutSeconds: 20 + intervalSeconds: 10 + ``` + +- httpProbe may be used to specify the Probe URLs, HTTP interval and timeout in seconds. `urls` if set, the + egress gateway pod will probe each external service periodically. If all probes fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. + `timeoutSeconds` controls the timeout before reporting non-ready if all probes are failing. + + ```yaml + httpProbe: + urls: + - probeURL + - probeURL + timeoutSeconds: 30 + intervalSeconds: 10 + ``` +- Please refer to the [operator reference docs](../../reference/installation/api.mdx) for details about the egress gateway resource type. + +The health port `8080` is used by: + +- The Kubernetes `readinessProbe` to expose the status of the egress gateway pod (and any ICMP/HTTP + probes). +- Remote pods to check if the egress gateway is "ready". Only "ready" egress + gateways will be used for remote client traffic. This traffic is automatically allowed by $[prodname] and + no policy is required to allow it. $[prodname] only sends probes to egress gateway pods that have a named + "health" port. This ensures that during an upgrade, health probes are only sent to upgraded egress gateways. + +### Deploy on a RKE2 CIS-hardened cluster + +If you are deploying `egress-gateway` on a RKE2 CIS-hardened cluster, its `PodSecurityPolicies` restrict the `securityContext` and `volumes` required by egress gateway. When deploying using the egress gateway custom resource, the Tigera Operator sets up `PodSecurityPolicy`, `Role`, `RoleBinding` and associated `ServiceAccount`. + +### Configure iptables backend for egress gateways + +The Tigera Operator configures egress gateways to use the same iptables backend as `calico-node`. 
+To modify the iptables backend for egress gateways, you must change the `iptablesBackend` field in the [Felix configuration](../../reference/resources/felixconfig.mdx). + +### Configure IP autodetection for dual-ToR clusters + +If you plan to use egress gateways in a [dual-ToR cluster](../configuring/dual-tor.mdx), you must also adjust the $[nodecontainer] IP +auto-detection method to pick up the stable IP, for example using the `interface: lo` setting +(the default `first-found` setting skips over the `lo` interface). This can be configured via the +$[prodname] [Installation resource](../../reference/installation/api.mdx#nodeaddressautodetection). + +### Affine a client pod to a specific node + +In host IP mode, you may want to control which node your client pods run on to ensure deterministic +routing through a specific egress gateway. Use `nodeSelector` to schedule a client pod to a specific +node: + +```bash +kubectl apply -f - < + containers: + - name: alpine + image: alpine + command: ["/bin/sleep"] + args: ["infinity"] +EOF +``` + +Replace `` with the hostname of the desired node. When combined with an egress gateway +policy that uses `gatewayPreference: PreferNodeLocal`, the client pod will prefer to route traffic +through an egress gateway on the same node, ensuring the traffic exits with that node's IP. + +### Configure namespaces and pods to use egress gateways + +You can configure namespaces and pods to use an egress gateway by: +* annotating the namespace or pod +* applying an egress gateway policy to the namespace or pod. + +Using an egress gateway policy is more complicated, but it supports more advanced use cases. + +#### Configure a namespace or pod to use an egress gateway (annotation method) + +In a $[prodname] deployment, the Kubernetes namespace and pod resources honor annotations that +tell that namespace or pod to use particular egress gateways. 
These annotations are selectors, and +their meaning is "the set of pods, anywhere in the cluster, that match those selectors". + +So, to configure all the pods in a namespace to use the egress gateways that are +labelled with `egress-code: red`, you would annotate that namespace like this: + +```bash +kubectl annotate ns egress.projectcalico.org/selector="egress-code == 'red'" +``` + +By default, that selector can only match egress gateways in the same namespace. To select gateways +in a different namespace, specify a `namespaceSelector` annotation as well, like this: + +```bash +kubectl annotate ns egress.projectcalico.org/namespaceSelector="projectcalico.org/name == 'default'" +``` + +Egress gateway annotations have the same [syntax and range of expressions](../../reference/resources/networkpolicy.mdx#selector) as the selector fields in +$[prodname] [network policy](../../reference/resources/networkpolicy.mdx#entityrule). + +To configure a specific Kubernetes Pod to use egress gateways, specify the same annotations when +creating the pod. For example: + +```bash +kubectl apply -f - < egress.projectcalico.org/egressGatewayPolicy="egw-policy1" +``` + +To configure a specific Kubernetes pod to use the same policy, specify the same annotations when +creating the pod. +For example: + +```bash +kubectl apply -f - < -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' +``` + +By way of a concrete example, you could use netcat to run a test server outside the cluster; for +example: + +```bash +docker run --net=host --privileged subfuzion/netcat -v -l -k -p 8089 +``` + +Then provision an egress IP Pool (with `natOutgoing: true`), and egress gateways, as above. + +Then deploy a pod, with egress annotations as above, and with any image that includes netcat, for example: + +```bash +kubectl apply -f - < -n -- nc 8089 ` should be the IP address of the netcat server. 
+ +Then, if you check the logs or output of the netcat server, you should see: + +``` +Connection from received +``` + +with `` being the **node IP** of the host where the egress gateway pod is running (not +the gateway's pod IP or the egress IP pool IP). + +## Upgrade egress gateways + +From v3.16, egress gateway deployments are managed by the Tigera Operator. + +- When upgrading from a pre-v3.16 release, no automatic upgrade will occur. To upgrade a pre-v3.16 egress gateway deployment, + create an equivalent EgressGateway resource with the same namespace and the same name as mentioned [above](#deploy-a-group-of-egress-gateways); + the operator will then take over management of the old Deployment resource, replacing it with the upgraded version. + +- Use `kubectl apply` to create the egress gateway resource. Tigera Operator will read the newly created resource and wait + for the other $[prodname] components to be upgraded. Once the other $[prodname] components are upgraded, Tigera Operator + will upgrade the existing egress gateway deployment with the new image. + +By default, upgrading egress gateways will sever any connections that are flowing through them. To minimise impact, +the egress gateway feature supports some advanced options that give feedback to affected pods. For more details see +the [egress gateway maintenance guide](egress-gateway-maintenance.mdx). + +## Additional resources + +Please see also: + +- The `egressIP...` fields of the [FelixConfiguration resource](../../reference/resources/felixconfig.mdx#spec). 
+- [Additional configuration for egress gateway maintenance](egress-gateway-maintenance.mdx) diff --git a/sidebars-calico-cloud.js b/sidebars-calico-cloud.js index eeb0e0105f..8eb1d11923 100644 --- a/sidebars-calico-cloud.js +++ b/sidebars-calico-cloud.js @@ -371,6 +371,7 @@ module.exports = { link: { type: 'doc', id: 'networking/egress/index' }, items: [ 'networking/egress/egress-gateway-on-prem', + 'networking/egress/egress-gateway-host-ip', 'networking/egress/egress-gateway-aws', 'networking/egress/egress-gateway-azure', 'networking/egress/egress-gateway-maintenance', diff --git a/sidebars-calico-enterprise.js b/sidebars-calico-enterprise.js index 1f4a792134..6296d3a831 100644 --- a/sidebars-calico-enterprise.js +++ b/sidebars-calico-enterprise.js @@ -191,6 +191,7 @@ module.exports = { link: { type: 'doc', id: 'networking/egress/index' }, items: [ 'networking/egress/egress-gateway-on-prem', + 'networking/egress/egress-gateway-host-ip', 'networking/egress/egress-gateway-azure', 'networking/egress/egress-gateway-aws', 'networking/egress/egress-gateway-maintenance',