feat: implement node selector for nodeport svc #11202

Open
tanujd11 wants to merge 2 commits into projectcalico:master from tanujd11:feat/nodeport-node-selector

Conversation

@tanujd11
Contributor

@tanujd11 tanujd11 commented Oct 17, 2025

Description

Adds a projectcalico.org/nodeSelector annotation to selectively expose NodePort services on specific nodes based on node labels. Designed for multi-tenant clusters where tenants have dedicated nodes.
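As a sketch of how such an annotation could drive per-node filtering (the selector syntax and the helper name `nodeMatchesSelector` here are illustrative assumptions, not the PR's actual implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// nodeMatchesSelector reports whether a node's labels satisfy a selector
// given as a comma-separated list of "key=value" pairs, e.g. the value of a
// hypothetical projectcalico.org/nodeSelector annotation such as
// "tenant=team-a,zone=eu-west". This is a simplified sketch; the real
// annotation may support richer Kubernetes selector syntax.
func nodeMatchesSelector(selector string, nodeLabels map[string]string) bool {
	if strings.TrimSpace(selector) == "" {
		return true // simplifying assumption: an empty selector matches all nodes
	}
	for _, term := range strings.Split(selector, ",") {
		kv := strings.SplitN(strings.TrimSpace(term), "=", 2)
		if len(kv) != 2 || nodeLabels[kv[0]] != kv[1] {
			return false
		}
	}
	return true
}

func main() {
	labels := map[string]string{"tenant": "team-a", "zone": "eu-west"}
	fmt.Println(nodeMatchesSelector("tenant=team-a", labels)) // prints true
	fmt.Println(nodeMatchesSelector("tenant=team-b", labels)) // prints false
}
```

A node whose labels satisfy every term of the annotation would program the NodePort entry; all other nodes would skip it.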

Related issues/PRs

fixes #5495

Todos

  • Tests
  • Documentation
  • Release note

Release Note

ebpf: expose annotated nodeports on selected nodes only

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

Signed-off-by: tanujd11 <dwiveditanuj41@gmail.com>
Copilot AI review requested due to automatic review settings October 17, 2025 08:00
@tanujd11 tanujd11 requested a review from a team as a code owner October 17, 2025 08:00
@marvin-tigera marvin-tigera added this to the Calico v3.32.0 milestone Oct 17, 2025
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Oct 17, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR implements selective NodePort service exposure based on node labels, enabling multi-tenant clusters where tenants have dedicated nodes. Services can now use the projectcalico.org/nodeSelector annotation to control which nodes expose their NodePort services.

Key Changes:

  • Adds node selector evaluation logic that checks node labels against service annotations before programming NodePort entries
  • Implements node label tracking and update callbacks in the BPF dataplane components
  • Extends the BPF proxy syncer to support node label-based filtering of NodePort services

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Summary per file:

  • felix/dataplane/linux/int_dataplane.go: registers the node labels update callback with the BPF route manager
  • felix/dataplane/linux/bpf_route_mgr.go: adds node labels tracking and a callback mechanism for label updates
  • felix/bpf/proxy/syncer_test.go: adds comprehensive test coverage for node selector functionality
  • felix/bpf/proxy/syncer_bench_test.go: updates benchmark tests to pass nil for the new nodeLabels parameter
  • felix/bpf/proxy/syncer.go: implements the core node selector matching logic and extends the NewSyncer signature
  • felix/bpf/proxy/proxy_bench_test.go: updates the benchmark to pass nil for the new nodeLabels parameter
  • felix/bpf/proxy/proxy.go: adds the NodeSelector annotation constant and the service annotation interface
  • felix/bpf/proxy/lb_src_range_test.go: updates tests to pass nil for the new nodeLabels parameter
  • felix/bpf/proxy/kube-proxy.go: implements node label fetching and resync logic on label changes

Signed-off-by: tanujd11 <dwiveditanuj41@gmail.com>
@tanujd11 tanujd11 requested a review from Copilot October 17, 2025 11:32
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated no new comments.

Member

@caseydavenport caseydavenport left a comment

Just doing a first pass here - I think the main thing is that we need to get rid of the explicit Get() calls to the API server and instead rely on syncer data.

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

node, err := kp.k8s.CoreV1().Nodes().Get(ctx, kp.hostname, metav1.GetOptions{})
Member

I think we shouldn't be performing any Get requests from within the Felix code. Felix runs as a DaemonSet on every node in the cluster, and for very large clusters this can result in problems with scale.

We will need to feed this via the Syncer which is a Watch API fed by Typha for scale purposes. I believe Felix is already receiving node labels over that API (or at least has the ability to without much fuss)

}

// NewK8sServicePortWithSelector creates a new k8s ServicePort with a node selector
func NewK8sServicePortWithSelector(clusterIP net.IP, port int, proto v1.Protocol,
Member

This looks like it is only used in test code, so should probably live in the _test.go file

// Trigger a resync with the current host IPs
// This will recreate the syncer with updated node labels
if len(hostIPs) > 0 {
kp.OnHostIPsUpdate(hostIPs)
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are already getting node label updates streamed to us, so I think instead of calling OnHostIPsUpdate here we could introduce a new method for updating the labels that triggers a re-evaluation of Service programming, getting rid of the need for the Get() call mentioned above, right?
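A minimal sketch of what such a label-update method might look like, assuming labels arrive via the syncer. The type `kubeProxyStub` and the names `OnNodeLabelsUpdate`, `nodeLabels`, and `resyncs` are stand-ins based on the discussion above, not the actual kube-proxy code:

```go
package main

import (
	"fmt"
	"maps"
	"sync"
)

// kubeProxyStub is a minimal stand-in for the real kube-proxy state.
type kubeProxyStub struct {
	lock       sync.Mutex
	nodeLabels map[string]string
	resyncs    int
}

// OnNodeLabelsUpdate stores labels streamed from the syncer and triggers a
// re-evaluation of Service programming only when they actually changed,
// avoiding any direct Get() against the API server.
func (kp *kubeProxyStub) OnNodeLabelsUpdate(labels map[string]string) {
	kp.lock.Lock()
	changed := !maps.Equal(kp.nodeLabels, labels)
	if changed {
		kp.nodeLabels = maps.Clone(labels)
	}
	kp.lock.Unlock()
	if changed {
		kp.resyncs++ // in the real code this would re-run Service programming
	}
}

func main() {
	kp := &kubeProxyStub{}
	kp.OnNodeLabelsUpdate(map[string]string{"tenant": "team-a"})
	kp.OnNodeLabelsUpdate(map[string]string{"tenant": "team-a"})
	fmt.Println(kp.resyncs) // prints 1: the second, identical update triggers no resync
}
```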

Contributor

yes, I was going to say that we should perhaps use the same/similar approach as when updating the host IPs. Both are expected to be rather infrequent events.

Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow the OnHostIPsUpdate example. To avoid possibly restarting kube-proxy 2x, it might be the best to move https://github.com/projectcalico/calico/blob/master/felix/dataplane/linux/bpf_route_mgr.go#L590 to https://github.com/projectcalico/calico/blob/master/felix/dataplane/linux/bpf_route_mgr.go#L242 if there was any change and send labels and hostIPs as a single update.

kp.lock.Unlock()

// Check if labels actually changed
labelsChanged := len(oldLabels) != len(labels)
Member

You should be able to use maps.Equal for this to simplify a bit

Contributor

@tomastigera tomastigera left a comment

Note that this would only exclude external traffic from being handled by the nodeport. It would not exclude a pod from reaching the nodeport via the excluded host's IP. But that should not really matter as there is no forwarding from that IP's node to the node with the backend. Nodeport is always resolved straight to the backend pod IP on the client pod's node.

defer kp.lock.Unlock()

kp.lastHostIPs = hostIPs
kp.nodeLabels = kp.fetchNodeLabels()
Contributor

you do not need to fetch the labels, you already got them in OnNodeLabelsUpdate()

Contributor Author

yes, agreed. I am updating it. I got a bit confused there. Thanks.

@tomastigera
Contributor

Also note that with the k8s libs bump in current master we would need to incorporate the node manager kubernetes/kubernetes#133059, so you may get all you need in a much simpler way if you hold off a little bit.

@tomastigera
Contributor

Turns out that the node selector is not that helpful after all 😞 so we would need to send the labels as I proposed. We would like to decouple the callback and the proxy execution so that reprogramming kube-proxy would not block the rest of felix from running.

@github-actions
Contributor

github-actions bot commented Jan 3, 2026

This PR is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale Issues without recent activity label Jan 3, 2026
@tomastigera
Contributor

@tanujd11 just to follow up: @aaaaaaaalex is now working on some of the base work for this - it has not completely fallen off our radar.

@github-actions github-actions bot removed the stale Issues without recent activity label Feb 5, 2026
@tomastigera
Contributor

@tanujd11 what is the exact scenario? Excluding exposing a port on a certain node is sort of like placing a deny policy that would block access to the port on selected nodes. Does your LB already understand which nodes not to use to target the service? I guess you do not want to run the backend pods on these nodes and you have a way to do that. I guess your LB would only target the nodes with the pods.

So is this change just to not implement the nodeport on these nodes to make things clean? Or to leave the nodeport for the use of other services on these nodes?

@lwr20
Member

lwr20 commented Feb 24, 2026

> @tanujd11 what is the exact scenario? Excluding exposing a port on a certain node is sort of like placing a deny policy that would block access to the port on selected nodes. Does your LB already understand which nodes not to use to target the service? I guess you do not want to run the backend pods on these nodes and you have a way to do that. I guess your LB would only target the nodes with the pods.
>
> So is this change just to not implement the nodeport on these nodes to make things clean? Or to leave the nodeport for the use of other services on these nodes?

We even have this use-case in a policy tutorial in the docs: https://docs.tigera.io/calico/latest/network-policy/services/kubernetes-node-ports#allow-ingress-traffic-to-specific-node-ports


Labels

docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

select nodes to expose NodePort service in eBPF mode

6 participants