RSDK-12103: Add failing modules to error for failed API Model #5461
module/modmanager/manager.go
moduleLogger.CErrorw(ctx, "Error adding module", "module", conf.Name, "error", err)
mgr.muFailedModules.Lock()
mgr.failedModules[conf.Name] = true
mgr.muFailedModules.Unlock()
Add a module to failedModules if adding it fails.
module/modmanager/manager.go
	errs[i] = err
	return
}
mgr.muFailedModules.Lock()
If a module is added successfully, check whether it is in the failed modules list and, if so, delete it from there.
this looks good - please add some regression tests for each of the cases you tested manually!
Agree with Cheuk, and I do see some nil pointer exception in the test failures.
cheukt left a comment
The new tests are a great start. Can we also add some integration tests? In particular, it would be good to annotate at least some of the tests in module_lifecycle_test.go (especially the ones with crashes in them) with checks on failed modules. You can get the local robot in those tests with r.(*localRobot)
Good point. I am going to move all my tests into module_lifecycle_test.go so that I have access to the localRobot and can better replicate real conditions, rather than calling isolated functions like Add() and UpdateFailedModules(). I will only test my Add/Remove functions.
// to reconfigure.
if err := mod.Validate(""); err != nil {
	manager.logger.CErrorw(ctx, "module config validation error; skipping", "module", mod.Name, "error", err)
	manager.moduleManager.AddToFailedModules(mod.Name)
If you update the module with an invalid exec, it won't run reconfigure, so add it to the failed modules manually here.
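The skip-and-record pattern in the diff above can be illustrated in isolation. This is only a sketch under stated assumptions: `moduleConfig`, its `validate` method, and `filterValid` are hypothetical, and the placeholder check (an empty exec path) is far simpler than whatever the real `Validate` does.

```go
package main

import (
	"errors"
	"fmt"
)

// moduleConfig is a hypothetical, trimmed-down module config.
type moduleConfig struct {
	Name    string
	ExePath string
}

// validate is a placeholder check used only for this sketch.
func (m moduleConfig) validate() error {
	if m.ExePath == "" {
		return errors.New("exec path is empty")
	}
	return nil
}

// filterValid returns the configs that pass validation, plus the names of
// the ones that do not. Invalid configs are skipped but still recorded,
// because they never reach the code path that would otherwise track them.
func filterValid(mods []moduleConfig) (valid []moduleConfig, failed []string) {
	for _, mod := range mods {
		if err := mod.validate(); err != nil {
			failed = append(failed, mod.Name)
			continue
		}
		valid = append(valid, mod)
	}
	return valid, failed
}

func main() {
	mods := []moduleConfig{
		{Name: "working-module", ExePath: "/usr/local/bin/working-module"},
		{Name: "local-module-1"}, // invalid: no exec path
	}
	valid, failed := filterValid(mods)
	fmt.Println(len(valid), failed)
}
```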
Whenever a module fails to start up, add it to the failing modules map.
A failing module is not stored in mgr.modules, so the config diff cannot detect the difference between the previous config and the newConfig in which the failing module was removed. To tell when a failing module has been removed, I compare the failing modules list directly against the newConfig and delete any entry that no longer appears in the newConfig.
When modules are removed for other reasons, they are also removed from the failing modules list.
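The comparison against the newConfig described above could look something like the sketch below. It is an assumption-laden illustration rather than the PR's implementation: `pruneFailed` is a hypothetical helper, and the new config is reduced to a plain list of module names.

```go
package main

import "fmt"

// pruneFailed deletes every entry of failed whose name does not appear in
// configured, the list of module names present in the new config.
func pruneFailed(failed map[string]bool, configured []string) {
	inConfig := make(map[string]struct{}, len(configured))
	for _, name := range configured {
		inConfig[name] = struct{}{}
	}
	for name := range failed {
		if _, ok := inConfig[name]; !ok {
			// Deleting from a map while ranging over it is safe in Go.
			delete(failed, name)
		}
	}
}

func main() {
	failed := map[string]bool{"local-module-1": true, "removed-module": true}
	// The new config still lists local-module-1 but no longer removed-module.
	pruneFailed(failed, []string{"local-module-1", "working-module"})
	fmt.Println(failed)
}
```

Building a set from the new config first keeps the prune linear in the number of modules instead of rescanning the config for every failing entry.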
TLDR: testing fixing a failing local module, non-local module, and deleting modules
Testing:
Adding failing module
Original state: No failing modules and some working modules:
11/10/2025, 1:02:53 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
Adding a failing local module (called local-module-1):
11/14/2025, 2:55:12 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; May be in failing module: [local-module-1]; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
Add a failing module using hot reload
11/14/2025, 2:56:20 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; May be in failing module: [local-module-1 allisonorg_hot-reload-module_from_reload]; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
Fixing failing module
4. Fix the failing local module
11/14/2025, 2:57:57 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; May be in failing module: [allisonorg_hot-reload-module_from_reload]; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
11/14/2025, 2:59:04 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
Deleting Failing Module
6. Re-add Failing module
11/14/2025, 3:00:37 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; May be in failing module: [local-module-1]; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
11/14/2025, 3:01:27 PM error rdk.resource_manager.rdk:component:board/board-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:component:board with model s:s:s not registered; There may be no module in config that provides this model resource rdk:component:board/board-1 model s:s:s
Making a working module fail
7. Reload a working module with a broken link
11/19/2025, 1:38:14 PM error rdk.resource_manager.rdk:service:discovery/discovery-1 resource/graph_node.go:308 resource build error: unknown resource type: API rdk:service:discovery with model viam:find-webcams:webcam-discovery not registered; May be in failing module: [workingmodule]; There may be no module in config that provides this model resource rdk:service:discovery/discovery-1 model viam:find-webcams:webcam-discovery
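The shape of the error text in the logs above can be reproduced with a small sketch. The wording is copied from the logged messages, but the function itself is an assumption: `unknownModelMessage` is a hypothetical name, and this is not taken from the RDK source. The bracketed list in the logs matches how Go's `%v` verb prints a `[]string`, which is what the sketch relies on.

```go
package main

import "fmt"

// unknownModelMessage builds an error string in the shape seen in the logs.
// The failing-module hint is included only when at least one module is failing.
func unknownModelMessage(api, model string, failing []string) string {
	msg := fmt.Sprintf("unknown resource type: API %s with model %s not registered", api, model)
	if len(failing) > 0 {
		// %v prints a string slice as space-separated names inside brackets.
		msg += fmt.Sprintf("; May be in failing module: %v", failing)
	}
	return msg + "; There may be no module in config that provides this model"
}

func main() {
	fmt.Println(unknownModelMessage("rdk:component:board", "s:s:s", nil))
	fmt.Println(unknownModelMessage("rdk:component:board", "s:s:s", []string{"local-module-1"}))
}
```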