metrics suggestion: backup jobs, replication jobs #112
This exporter uses the PVE REST API. Looking through the API docs, I found the following routes that might cover your requirements, at least partly: absent backups: Regarding high IO delay, I recommend taking a look at node_exporter. For node-level metrics, this is usually the better option.
thanks @znerol
while i originally meant backup jobs that for some reason didn't execute, i also like the idea of alerting when a VM doesn't have a backup job at all. for the rest i'll also have a look at the API to see which items would be useful to add.
indeed, thanks! i thought PVE was doing something special, but according to the frontend code it evaluates the system's IO wait, which can be gathered by other means.
Hello everyone, is there any progress? I am facing a similar problem. I need to know which machines were left without a backup, or where a backup failed with an error.
IO wait would be a very useful metric to have, IMO, if possible, especially for those using ZFS for backing storage.
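For context on what "gathering IO wait by other means" can look like, here is a small sketch that derives an iowait fraction from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. It assumes this is close enough to the figure shown in the web UI, which I have not verified, and the function names are made up for this example. In practice node_exporter already exposes these counters, so this is only meant to show the arithmetic.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate 'cpu' line in /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_fraction(before, after):
    """Share of CPU time spent in iowait between two counter samples.

    Field order: user nice system idle iowait irq softirq steal
    (guest and guest_nice follow, but they are already counted in
    user and nice, so only the first eight fields are summed).
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return deltas[4] / total


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart (Linux only)."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return iowait_fraction(before, after)
```

Calling `sample_iowait(5)` on a node returns a value between 0 and 1; multiply by 100 for a percentage.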
Please use
Thank you!
Add replication metrics as requested in issue #112.

* Replication metrics are fetched per node
* The metrics can be enabled or disabled

Based on the original PR #166, adapted to the new file structure.

Signed-off-by: Sven Gerber <[email protected]>
Co-authored-by: znerol <[email protected]>
Co-authored-by: Marian Koreniuk <[email protected]>

Thanks to @svengerber and @themoriarti, replication metrics are available as of release v3.3.0.
hey @znerol, thank you for creating this helpful exporter 🙌
i'd like to track and set up alerts for failed or absent backups, replications, and on high IO delay (the one that's displayed in the webui for each node).
cheers 👋