Skip to content

Adding null check for VTEPInfo and IPaddress#11028

Open
ajgupta42 wants to merge 2 commits intoprojectcalico:masterfrom
ajgupta42:anuj/null-patch
Open

Adding null check for VTEPInfo and IPaddress#11028
ajgupta42 wants to merge 2 commits intoprojectcalico:masterfrom
ajgupta42:anuj/null-patch

Conversation

@ajgupta42
Copy link

@ajgupta42 ajgupta42 commented Sep 17, 2025

Description

Missing both IPv4 and IPv6 VTEP information
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation]
github.com/projectcalico/calico/felix/calc.l3rrNodeInfo.AddressAsCIDR

/go/src/github.com/projectcalico/calico/felix/calc/l3_route_resolver.go:166

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Adding nil checks

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@ajgupta42 ajgupta42 requested a review from a team as a code owner September 17, 2025 10:29
@marvin-tigera marvin-tigera added this to the Calico v3.32.0 milestone Sep 17, 2025
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Sep 17, 2025
@CLAassistant
Copy link

CLAassistant commented Sep 17, 2025

CLA assistant check
All committers have signed the CLA.

@nelljerram
Copy link
Member

@ajgupta42 Please could you add more information:

  • the stack trace of the panic that you saw
  • any ideas about why this is happening in your cluster - bearing in mind that we've missed this for some reason in our existing testing.

@ajgupta42
Copy link
Author

ajgupta42 commented Sep 17, 2025

@nelljerram It was failing on AsCIDR, so added some nil checks to prevent seg fault
https://github.com/projectcalico/calico/pull/11028/files#diff-dfd7a82ed33eb6795753c3c8c92bddf7470585efe073b284384c4a11a557e8f8R171

IP address was blank for a node which caused this issue.

@nelljerram
Copy link
Member

Thanks @ajgupta42 but I don't fully understand yet.

  • Can you provide the full stack trace? That will help me understand the scenario overall.
  • What exactly do you mean by "IP address was blank for a node"?

@ajgupta42
Copy link
Author

ajgupta42 commented Sep 18, 2025

Thanks @nelljerram for the quick response, highly appreciated.
Please find the full stack trace

goroutine 582 [runningl:
github.com/projectcalico/calico/felix/calc.13rrNodelnfo.AddressesAsCIDRs(f{0xa, Oxa, Ox20, 0x18}, {(0xa, Oxa, 0x20, 0x0}, Ox1a}, (ОxО,...])...)
/go/src/github.com/projectcalico/calico/felix/calc/13_route_resolver.go:166 +0x309
github.com/projectcalico/calico/felix/calc.(*L3RouteResolver).onNodeUpdate(0xc00103e990, {0xc000bd352d, Oxf), Oxc001319978)
/go/src/github.com/projectcalico/calico/felix/calc/13_route_resolver.go:578 +0xe25
github.com/projectcalico/calico/felix/calc.(*L3RouteResolver).OnResourceUpdate(0xc00103e990, ii(0x509faa0, 0xc001410630}, (0x488d780, 0xc000a036c0}, {0xc001164900,
0x9}, Ox0, OxO}, Ox1})
/go/src/github.com/projectcalico/calico/felix/calc/13_route_resolver.go:442 +0xae5
github.com/projectcalico/calico/felix/dispatcher.updateHandlers.DispatchToAlI(..)
/go/src/github.com/projectcalico/calico/felix/dispatcher/dispatcher.go:45
github.com/projectcalico/calico/felix/dispatcher.(*Dispatcher).OnUpdate(0xc000072f20, (l{0x509faaO, 0xc001410630}, {0x488d780,
Oxc000a036c0}, (0xc001164900, 0x9}, 0x0,
OxO}, 0x1))
/go/src/github.com/projectcalico/calico/felix/dispatcher/dispatcher.go:105 +0x395
github.com/projectcalico/calico/felix/dispatcher.(*Dispatcher).OnUpdates(...)
/go/src/github.com/projectcalico/calico/felix/dispatcher/dispatcher.go:80
github.com/projectcalico/calico/felix/calc. (*CalcGraph).OnUpdates(...)
/go/src/github.com/projectcalico/calico/felix/calc/calc_graph.go:137
github.com/projectcalico/calico/felix/calc.(*AsyncCalcGraph).loop(0xc0006a5500)
/go/src/github.com/projectcalico/calico/felix/calc/async_calc_graph.go:151 +0x75c
created by github.com/projectcalico/calico/felix/calc.(*AsyncCalcGraph).Start in goroutine 1 /go/src/github.com/projectcalico/calico/felix/calc/async_calc_graph.go:259 +0xf8

What exactly do you mean by "IP address was blank for a node"?
> This was an edge-case scenario. Due to a bug, the node’s ExternalIP became blank (it wasn’t set to <none>). We had to manually clean it up & set to <none>. I believe we can handle this with a nil check. Please let me know if I’m missing anything or if there are other edge cases we should cover.

@nelljerram
Copy link
Member

@ajgupta42 Thank you for your last response. I'm afraid I'm still a little unclear about the scenario, though, and I also realized that we should have a test for it - which would be an excellent way of clarifying the scenario. It sounds like an FV test would be appropriate for this situation, so could you look at adding an FV test? Please let me know if you need more info about how our FV tests work.

@nelljerram
Copy link
Member

@ajgupta42 You've added UTs - thank you - but unfortunately they don't help to explain the overall scenario, and that's why I suggested an FV test.

@github-actions
Copy link
Contributor

This PR is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stale Issues without recent activity label Nov 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small) stale Issues without recent activity

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants