
Feature Request: Support EndpointSlices Without In-cluster Pod Targets in Ingress #4017

Open
kahirokunn opened this issue Jan 15, 2025 · 5 comments
Labels: kind/feature

Comments

@kahirokunn
Member

kahirokunn commented Jan 15, 2025

Related Problem

In a multi-cluster EKS environment that shares Services via the Multi-Cluster Services (MCS) API, multiple EndpointSlices may exist for a single Service. Currently, in “target-type: ip” mode, the AWS Load Balancer Controller registers only the IPs of Pods running in the local cluster. It does not register:

  1. Pod IPs from other clusters exposed via the MCS API and listed in EndpointSlices; or
  2. External IPs included in EndpointSlices whose TargetRef.Kind is not "Pod".

This behavior forces users to employ workarounds—such as using “target-type: instance” and routing traffic through NodePorts—which can introduce suboptimal routing and increase the risk of disruptions if a Node is scaled in or replaced.

Proposed Unified Solution

Enhance the AWS Load Balancer Controller to directly register IP addresses from EndpointSlices in “target-type: ip” mode, even if those addresses are intended for multi-cluster usage (MCS) or represent external endpoints. This can be done by:

  • Recognizing that an EndpointSlice may contain additional or external IP addresses (for instance, based on TargetRef.Kind != "Pod").
  • Incorporating these addresses into the Target Group, alongside the local cluster Pod IPs already handled.

A relevant part of the AWS Load Balancer Controller’s current endpoint resolution logic is the following check:

// Endpoints whose TargetRef is not an in-cluster Pod are skipped entirely.
if ep.TargetRef == nil || ep.TargetRef.Kind != "Pod" {
    continue
}

Here, the logic could be extended to handle these alternative address types. For example, if the endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io label is missing, the Controller might treat the EndpointSlice’s IP addresses as external IPs; or if EndpointSlice.Endpoints[].TargetRef.Kind != "Pod", the Controller might interpret them as external endpoints.
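
As a rough sketch, the extension could look something like the following. This is not the controller’s actual implementation: the resolveTargets function, the ipTarget type, and the port parameter are hypothetical names introduced only to illustrate the idea, using the standard k8s.io/api/discovery/v1 types.

package sketch

import (
	discoveryv1 "k8s.io/api/discovery/v1"
)

// ipTarget is a hypothetical stand-in for the controller's internal
// endpoint representation; it is not a real controller type.
type ipTarget struct {
	IP   string
	Port int32
}

// resolveTargets keeps the existing Pod path and adds a path for endpoints
// whose TargetRef is not an in-cluster Pod (for example, MCS-imported or
// otherwise external addresses).
func resolveTargets(slice discoveryv1.EndpointSlice, port int32) []ipTarget {
	var targets []ipTarget
	for _, ep := range slice.Endpoints {
		// Skip endpoints that are not ready to receive traffic.
		if ep.Conditions.Ready != nil && !*ep.Conditions.Ready {
			continue
		}
		if ep.TargetRef != nil && ep.TargetRef.Kind == "Pod" {
			// Existing behavior: the address belongs to a local Pod.
			// (The controller's Pod lookup and readiness-gate handling
			// are omitted from this sketch.)
			for _, addr := range ep.Addresses {
				targets = append(targets, ipTarget{IP: addr, Port: port})
			}
			continue
		}
		// Proposed behavior: no in-cluster Pod target, so register the
		// slice's addresses directly as IP targets.
		for _, addr := range ep.Addresses {
			targets = append(targets, ipTarget{IP: addr, Port: port})
		}
	}
	return targets
}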

In both cases, the goal remains the same: provide direct integration with new or external IP addresses listed in EndpointSlices, reducing complexity and offering more efficient traffic routing.

Alternatives Considered

Using “target-type: instance”

  • This solution leads to indirect routing (through NodePorts) and higher susceptibility to disruptions upon Node scale-in or replacement.

Example: MCS with Additional Cluster IPs

Below is a sample configuration demonstrating how MCS might export a Service, creating an EndpointSlice in one cluster with Pod IPs from another cluster:

apiVersion: v1
kind: Service
metadata:
  name: example-service
  namespace: default
spec:
  selector:
    app: example
  ports:
    - name: http
      port: 80
      protocol: TCP
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  namespace: default
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP":80}]'
spec:
  rules:
    - http:
        paths:
          - path: /*
            pathType: ImplementationSpecific
            backend:
              service:
                name: example-service
                port:
                  number: 80
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: example-service-remotecluster
  namespace: default
  labels:
    kubernetes.io/service-name: example-service
addressType: IPv4
ports:
  - name: "http"
    port: 80
    protocol: TCP
endpoints:
  - addresses:
      - 10.11.12.13   # Pod IP on a remote EKS cluster
    conditions:
      ready: true
      serving: true
      terminating: false
    nodeName: remote-node-1
    zone: remote-az-1

With the proposed feature enabled, the IP “10.11.12.13” would be recognized by the AWS Load Balancer Controller and automatically registered in the Target Group.

@kahirokunn changed the title FeatureRequest: Support EndpointSlices Without In-cluster Pod Targets in Ingress to Feature Request: Support EndpointSlices Without In-cluster Pod Targets in Ingress Jan 15, 2025
@shraddhabang added the triage/needs-investigation and kind/feature labels and removed the triage/needs-investigation label Jan 15, 2025
@zac-nixon
Collaborator

Could you expand further on this point:

This solution leads to indirect routing (through NodePorts) and higher susceptibility to disruptions upon Node scale-in or replacement.

Later versions of Kubernetes and the controller have made using NodePorts for traffic a lot more reliable. For example, when using cluster autoscaler: #1688

@kahirokunn
Member Author

@zac-nixon
Thank you for your insight and all the work you've done on this project. I wanted to share my experience using Karpenter instead of the Cluster Autoscaler. In my tests, when running ab (ApacheBench) or other load-testing tools while a node scales in, I often observe connections that do not return any response (instead of a 5xx error). After multiple rounds of verification, I suspect the following factors may be playing a role:

  1. Karpenter may terminate a node before it is fully deregistered from the ALB’s Target Group.
  2. There may be insufficient coordination between Karpenter and the AWS Load Balancer Controller during node termination.
  3. Long-lived connections, such as WebSockets, long polling, or HTTP/2 streams, remain open on nodes that are about to be terminated, and slow requests and long-running processes also stay active. As a result, when Karpenter scales in a node, these open connections or in-flight requests can be abruptly severed, and no response is ever returned to the client.

Additionally, supporting direct IP-based communication as described in the Kubernetes documentation, rather than routing traffic exclusively through Nodes, would further improve interoperability with existing controllers, foster additional integrations, and enable more significant innovation in the future.

@kahirokunn
Member Author

I've created a separate issue regarding the problem we discussed about AWS Load Balancer Controller not handling Karpenter taints:
#4023
Along with this, I've also created a related PR:
#4022
However, I still want to continue the discussion about Ingress resources supporting custom EndpointSlices, as I believe this is a needed feature.
Thx 🙏

@zac-nixon
Collaborator

Sorry for the delayed response. What automation are you using to populate the custom endpoint slice? I wonder if you can use a Multicluster Target Group Binding (https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/targetgroupbinding/targetgroupbinding/#multicluster-target-group) and then point your automation to just register the targets directly into the Target Group?
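
For context, a minimal sketch of what such direct registration could look like with the AWS SDK for Go v2 is below. The Target Group ARN, IP address, and port are placeholder values, and any real automation would also need to deregister targets as endpoints disappear.

package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	elbv2 "github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2"
	elbv2types "github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2/types"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatalf("load AWS config: %v", err)
	}
	client := elbv2.NewFromConfig(cfg)

	// Placeholder ARN for a Target Group shared across clusters via the
	// multicluster TargetGroupBinding feature.
	tgARN := "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/example/0123456789abcdef"

	// Register the remote-cluster Pod IP from the example above directly
	// as an IP target in the shared Target Group.
	_, err = client.RegisterTargets(ctx, &elbv2.RegisterTargetsInput{
		TargetGroupArn: aws.String(tgARN),
		Targets: []elbv2types.TargetDescription{
			{Id: aws.String("10.11.12.13"), Port: aws.Int32(80)},
		},
	})
	if err != nil {
		log.Fatalf("register targets: %v", err)
	}
}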

@kahirokunn
Member Author

I am currently trying to implement an MCS controller using Sveltos (Related Issue: projectsveltos/sveltos#435 (comment)).
While the proposed Multicluster Target Group Binding could achieve something similar, I believe there are challenges in the following areas:

  1. The ALB and Listener need to be managed by separate tools such as Terraform or Crossplane.
  2. The AWS Load Balancer Controller, and the information required for its operation, must be distributed to every cluster, which adds setup and management costs.
  3. It is not compatible with the sig-multicluster (MCS) API, making it difficult to extend and maintain in the long term.

On the other hand, if the AWS Load Balancer Controller directly supported custom EndpointSlices, which are a standard Kubernetes resource, the complicated setup described above would become unnecessary. I believe this approach is preferable because it achieves the configuration users ultimately need in a simpler way.
