Commit 8f763cb

JoaoBraveCoding authored and rexagod committed
Fixed markdown test
1 parent 5b8069c commit 8f763cb

1 file changed: +31 -26 lines changed

enhancements/monitoring/metrics-collection-profiles.md

Lines changed: 31 additions & 26 deletions
@@ -18,11 +18,14 @@ tracking-link:

 ## Terms

-monitors - refers to the CRDs ServiceMonitor, PodMonitor and Probe from Prometheus Operator;
+monitors - refers to the CRDs ServiceMonitor, PodMonitor and Probe from
+Prometheus Operator;

-users - refers to end-users of OpenShift who manage an OpenShift installation i.e cluster-admins;
+users - refers to end-users of OpenShift who manage an OpenShift installation
+i.e cluster-admins;

-developers - refers to OpenShift developers that build the platform i.e. RedHat associates and OpenSource contributors;
+developers - refers to OpenShift developers that build the platform i.e. RedHat
+associates and OpenSource contributors;


 ## Summary
@@ -48,14 +51,14 @@ Nevertheless, users have repeatedly asked for the ability to reduce the amount
 of memory consumed by Prometheus either by lowering the Prometheus scrape
 intervals or by modifying monitors.

-Users currently can not control the aforementioned monitors scraped by Prometheus
-since some of the metrics collected are essential for other parts of the system
-to function properly: recording rules, alerting rules, console dashboards, and
-Red Hat Telemetry. Users also are not allowed to tune the interval at which
-Prometheus scrapes targets as this again can have unforeseen results that can
-hinder the platform: a low scrape interval value may overwhelm the platform
-Prometheus instance while a high interval value may render some of the default
-alerts ineffective.
+Users currently can not control the aforementioned monitors scraped by
+Prometheus since some of the metrics collected are essential for other parts of
+the system to function properly: recording rules, alerting rules, console
+dashboards, and Red Hat Telemetry. Users also are not allowed to tune the
+interval at which Prometheus scrapes targets as this again can have unforeseen
+results that can hinder the platform: a low scrape interval value may overwhelm
+the platform Prometheus instance while a high interval value may render some of
+the default alerts ineffective.

 The goal of this proposal is to allow users to pick their desired level of
 scraping while limiting the impact this might have on the platform, via
@@ -90,17 +93,17 @@ kubelet and the network daemon.
 - As a developer, I want a supported way to collect a subset of the metrics
 exported by my operator and operands, while still collecting necessary metrics
 for alerts, visualization of key indicators and Telemetry.
-- As a developer of a component (that does not yet implement a profile), I want to
-extract metrics needed to implement said profile, based on the assets I
+- As a developer of a component (that does not yet implement a profile), I want
+to extract metrics needed to implement said profile, based on the assets I
 provide, or the ones gathered from the cluster based on a group of target
 selectors, and a plug-in relabel configuration to apply within the monitor.
-- As a developer of a component (that does not, or only partially implements a profile),
-I want to get information about any monitors that are not yet implemented for
-any of the supported profiles that are offered.
-- As a developer of a component (that implements a profile), I want to verify if all the
-profile metrics are present in the cluster, and which of the profile monitors
-are affected if not. Also, I want additional information to narrow down where
-these metrics are exactly being used.
+- As a developer of a component (that does not, or only partially implements a
+profile), I want to get information about any monitors that are not yet
+implemented for any of the supported profiles that are offered.
+- As a developer of a component (that implements a profile), I want to verify if
+all the profile metrics are present in the cluster, and which of the profile
+monitors are affected if not. Also, I want additional information to narrow
+down where these metrics are exactly being used.

 ### Goals

@@ -325,7 +328,8 @@ not. To aid teams with this effort the monitoring team will provide:
 resource (Alerts/PrometheusRules/Dashboards) using a metric which is not
 present in the monitor in question;

-- What happens if a user provides an invalid value for a metrics collection profile?
+- What happens if a user provides an invalid value for a metrics collection
+profile?
 - CMO will reconcile and validate that the value supplied is invalid and it
 will report Degraded=False and fail reconciliation.

@@ -352,8 +356,8 @@ not. To aid teams with this effort the monitoring team will provide:
 - Unit tests in CMO to validate that the correct monitors are being selected
 - E2E tests in CMO to validate that everything works correctly
 - For the `minimal` profile, origin/CI test to validate that every metric used
-in a resource (Alerts/PrometheusRules/Dashboards) exists in the `keep` expression
-of a minimal monitors.
+in a resource (Alerts/PrometheusRules/Dashboards) exists in the `keep`
+expression of a minimal monitors.

 ### Graduation Criteria

@@ -450,10 +454,11 @@ Initial proofs-of-concept:
 - [Azure
 Docs](https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/prometheus-metrics-scrape-configuration-minimal)
 - https://github.com/Azure/prometheus-collector
-- In their approach they also have [hardcoded](https://github.com/Azure/prometheus-collector/blob/66ed1a5a27781d7e7e3bb1771b11f1da25ffa79c/otelcollector/configmapparser/tomlparser-default-targets-metrics-keep-list.rb#L28)
+- In their approach they also have
+[hardcoded](https://github.com/Azure/prometheus-collector/blob/66ed1a5a27781d7e7e3bb1771b11f1da25ffa79c/otelcollector/configmapparser/tomlparser-default-targets-metrics-keep-list.rb#L28)
 set of metrics that are only consumed when the minimal profile is enabled.
-However, customers are also able to extend this minimal profile with regexes to
-include metrics which might be interesting to them.
+However, customers are also able to extend this minimal profile with regexes
+to include metrics which might be interesting to them.
 - Leverage [installer capabilities](https://docs.google.com/document/d/1I-YT7LKKDHSBLB6Hmg0tZ54DWjrAxlVdXxlViShMu-0/edit#heading=h.848jsje80fru)
 - After some consideration we decided to abandon this idea since it would only
 work for resources controlled by CVO which is not the case for the majority
