-
Notifications
You must be signed in to change notification settings - Fork 10
apps sc: S3 quota alerts per bucket #2568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
0c15241
to
d5674b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work, I think the templating, config and schema looks great, added some potential improvements and one concern regarding the Bucket36hActivityCheck
alert.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is another alert Bucket36hActivityCheck
defined in this file which also uses the exclude
list which we probably still would want for the specified buckets, should perhaps the exclusion for buckets listed in s3BucketAlerts.buckets
be done in the templating instead of manually adding them to the exclude
list, or, should we also have configurable alerts per bucket for Bucket36hActivityCheck
as well? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I added templating for the exclusion of buckets so it creates a combined list for .exclude and names of .buckets. I don't think separate alerts per bucket for the activity check is necessary since there is not any configureable value there like number of objects or size.
Was trying to make a helper function for the templating but ended up spending too much time with debugging templating map/list issues due to that 😩
Ended up just doing it in the template file, which I think was just fine since it might have been overkill with a helper function either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Schema looks good, with one tweak.
d5674b3
to
94502b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just added some comments about updating comments 😅
@@ -450,6 +450,18 @@ prometheus: | |||
percent: 80 | |||
count: 1638400 | |||
exclude: [] | |||
# Custom per-bucket alerts | |||
buckets: [] | |||
# - name: <cluster>-thanos # Also add name to exclude from regular alerts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the most recent changes you do not need to add bucket names to the exclude list right?
# - name: <cluster>-thanos # Also add name to exclude from regular alerts. | |
# - name: <cluster>-thanos # This gets excluded from regular object storage alerts |
@@ -18,6 +18,16 @@ s3BucketAlerts: | |||
percent: 80 | |||
count: 1638400 | |||
exclude: [] | |||
buckets: [] | |||
# - name: <cluster>-thanos # Also add name to exclude from regular alerts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same thing here:
# - name: <cluster>-thanos # Also add name to exclude from regular alerts. | |
# - name: <cluster>-thanos # This gets excluded from regular object storage alerts |
Added some ops people if they have some opinions on the alerts 🙂 |
Warning
This is a public repository, ensure not to disclose:
What kind of PR is this?
Required: Mark one of the following that is applicable:
Optional: Mark one or more of the following that are applicable:
Important
Breaking changes should be marked
kind/admin-change
orkind/dev-change
depending on typeCritical security fixes should be marked with
kind/security
What does this PR do / why do we need this PR?
...
Information to reviewers
Would be happy if anyone got any suggestions/improvements to the schema titles/descriptions for the new buckets section.
I changed how the S3 Bucket Alert expression works in general because of how it was leaving "gaps" of data when you query the expression. This was due to while testing I noticed how changing the for: 1h to for: 10m would make the alert start pending but then return to Normal since there was no data (when it should have triggered). If you see any issue with it let me know. See below for example.
New query/expression

Old query/expression

Example per bucket alerts (sc-config)
Checklist