Route-map: null when loading config during FRR restart #17552
Comments
This means it's connected later than the configuration is processed.
How do you know that? Can you get a backtrace or even a coredump? Also, what is the configuration?
In the past I have seen this
I have added a modified branch to catch the coredump, but the crash is not happening; as I said, it does not occur very often. How can I get a backtrace? Also, for completeness, is there documentation on how to catch the coredump (or a generic way to do it)? The config that the frr-reloader script loads is
To be sure, is this (a crash) happening with 10.2 or master? Or what happens if you move the prefix-list and route-map definitions before "router bgp"?
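One way to try the reordering suggested above is to rewrite the config text before handing it to the reloader. The following is a minimal sketch, not part of FRR or MetalLB: the helper names `split_blocks` and `definitions_first` are hypothetical, and it assumes a config where top-level statements are unindented and their sub-statements are indented by at least one space. Whether the reloader preserves this order afterwards is not established here.

```python
# Hypothetical helpers for preparing a test config; not FRR or MetalLB code.

# Top-level statements that define objects referenced from "router bgp".
DEFINITION_PREFIXES = ("ip prefix-list ", "ipv6 prefix-list ", "route-map ")


def split_blocks(config):
    """Group config lines into top-level blocks.

    An unindented line starts a new block; indented lines are attached to
    the block above them. Blank lines and "!" separator lines are dropped.
    """
    blocks = []
    for line in config.splitlines():
        if not line.strip() or line.strip() == "!":
            continue
        if line.startswith(" ") and blocks:
            blocks[-1].append(line)
        else:
            blocks.append([line])
    return blocks


def definitions_first(config):
    """Return the config with prefix-list and route-map blocks moved first.

    sorted() is stable, so the relative order inside each group is kept.
    """
    blocks = split_blocks(config)
    ordered = sorted(
        blocks, key=lambda block: not block[0].startswith(DEFINITION_PREFIXES)
    )
    return "\n".join(line for block in ordered for line in block)
```

Calling `definitions_first(text)` on a config that starts with "router bgp" returns the same statements with the prefix-list and route-map blocks placed ahead of it.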
This is happening with
I might be mistaken, but the frr-reloader script seems to reorganize the input, so the order we have is not the final order. Is that statement correct?
Can't test with
Is it possible to somehow replicate this deterministically?
We are trying, but for the time being we unfortunately cannot reproduce it deterministically.
Can you expand on that? I am not sure I understand who is connected and who is processing the config. Thanks.
Ignore my previous comment, I just realized the timing is different here:
and later:
@karampok is this somehow also related to issue #16367?
@karampok could you repeat this and send us the logs with the
Description
In MetalLB we use FRR inside a Kubernetes pod, and in our CI we occasionally observe a failure.
Upon a pod restart (and therefore an FRR restart), the FRR process does not send out prefixes, and the test fails.
What we observe is
while the FRR config clearly should allow it.
The full logs
indicate that the config was added (using the frr-reloader) before bgpd was up, and that the bgpd daemon probably crashed (
Cannot stop bgpd: pid 166 not running
or bgpd was stopped). This is not transient behavior: once it happens, it stays that way. This looks like a bug, because config should be inserted only if the daemons are up.
What do you think?
Thank you!
Version
How to reproduce
It happens in our CI, we have not been able to reproduce it.
Expected behavior
The frr-reloader script should not be able to insert config unless the daemon is up.
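The kind of guard described here could look roughly like the sketch below: before applying config, check that the daemon's PID file names a live process. This is only an illustration under stated assumptions, not how frr-reloader works; the function names are hypothetical, the PID file location is an assumption that depends on the installation, and a live PID alone does not prove the daemon is ready to accept config.

```python
import os
from pathlib import Path


def pid_is_running(pid):
    """Return True if a process with this PID currently exists (POSIX)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but is owned by another user
    return True


def daemon_is_up(pidfile):
    """Return True if the PID file exists and points at a live process.

    The pidfile path is supplied by the caller; its location is an
    assumption that depends on how FRR is packaged.
    """
    try:
        pid = int(Path(pidfile).read_text().strip())
    except (FileNotFoundError, ValueError):
        return False  # missing or malformed PID file
    return pid_is_running(pid)
```

A caller would skip or retry the reload while `daemon_is_up(path_to_bgpd_pidfile)` returns False, which corresponds to the "pid 166 not running" situation in the logs.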
Actual behavior
The frr-reloader script is able to add config even when the daemons are not up.
Additional context
No response