-
-
Notifications
You must be signed in to change notification settings - Fork 922
[client] Block on all subsystems on shutdown #4709
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a proper shutdown mechanism for the NetBird client by adding WaitGroups to track and wait for all long-running goroutines across multiple subsystems. This ensures clean shutdown by blocking until all goroutines complete, replacing a hardcoded 500ms sleep with proper synchronization.
Key Changes:
- Added
shutdownWgWaitGroups to track goroutines in route manager, DNS server, netflow manager, SR watcher, and engine - Implemented graceful shutdown with configurable timeout (10s base + 100ms per peer, capped at 30s)
- Refactored mutex unlock patterns to release locks before blocking on WaitGroup
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| client/internal/engine.go | Added central shutdown WaitGroup, timeout logic, and tracking for all engine goroutines; renamed restartEngine to triggerClientRestart for clarity |
| client/internal/routemanager/manager.go | Added WaitGroup to track client network watcher goroutines |
| client/internal/peer/guard/sr_watcher.go | Added WaitGroup for ICE monitor and fixed mutex unlock pattern |
| client/internal/netflow/manager.go | Added WaitGroup for ACK receiver and sender goroutines |
| client/internal/dns/server.go | Added WaitGroup to track DNS state persistence goroutine |
| client/internal/connect.go | Refactored engine stop logic to avoid holding mutex during shutdown |
Comments suppressed due to low confidence (1)
client/internal/peer/guard/sr_watcher.go:1
- The early return path unlocks the mutex manually while the normal path unlocks at line 74. This asymmetric unlock pattern is error-prone. Consider using a deferred unlock at the beginning and removing the explicit unlock at line 74, or restructure the logic to avoid mixed unlock patterns.
package guard
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| iceMonitor := NewICEMonitor(w.iFaceDiscover, w.iceConfig, GetICEMonitorPeriod()) | ||
| go iceMonitor.Start(ctx, w.onICEChanged) | ||
| w.shutdownWg.Add(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think it makes sense to wait for this. This struct is already used by the guard code, which waits for the exit and removes the listeners beforehand. It has no effect if it keeps running in the background for a while. This wait group only slows down the restart, and restart speed will be important on mobile due to route changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted that part
|



Describe your changes
Issue ticket number and link
Stack
Checklist
Documentation
Select exactly one:
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__