Support prometheus deployed via prometheus-operator #193

jaylevin · 2021-11-18T23:23:35Z

This issue is to address the incompatibility between the Keptn prometheus-service and Prometheus deployed via the Prometheus Operator.

Currently, the Keptn prometheus-service depends on reading and writing both the prometheus and alert-manager ConfigMaps that are deployed as part of the Prometheus Community helm chart. However, when Prometheus is deployed on K8s via the prometheus-operator, these ConfigMaps do not exist.

Instead, (from my very limited understanding) the prometheus-operator watches for ServiceMonitor CRs in order to configure new scrape jobs. The prometheus-service keptn integration should ideally be able to handle the deployment of these CRs in order to create new scrape jobs for each service/project/stage that is configured to be monitored.
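To make the idea concrete, here is a minimal sketch of the kind of ServiceMonitor manifest the integration could generate per service/project/stage. The `monitoring.coreos.com/v1` API group and the `namespaceSelector`, `selector` and `endpoints` fields come from the Prometheus Operator CRD; everything else is an assumption for illustration: the `service_monitor` helper, the `<project>-<stage>` namespace naming, the `app` and `release` labels, the `http` port name and the sample project are hypothetical and not taken from prometheus-service.

```python
import json


def service_monitor(project: str, stage: str, service: str,
                    port_name: str = "http",
                    release_label: str = "prometheus-operator") -> dict:
    """Build a ServiceMonitor manifest for one monitored service (illustrative only)."""
    # Assumption: the workload runs in a namespace named <project>-<stage>.
    namespace = f"{project}-{stage}"
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": f"{service}-{namespace}",
            "namespace": "monitoring",
            # Assumption: the Prometheus CR selects ServiceMonitors by this label.
            "labels": {"release": release_label},
        },
        "spec": {
            # Look for matching Services in the workload namespace.
            "namespaceSelector": {"matchNames": [namespace]},
            # Assumption: the Service carries an app=<service> label.
            "selector": {"matchLabels": {"app": service}},
            # port refers to a named port on the Service, not a number.
            "endpoints": [{"port": port_name, "path": "/metrics", "interval": "30s"}],
        },
    }


if __name__ == "__main__":
    # JSON is valid input for `kubectl apply -f -`.
    print(json.dumps(service_monitor("sockshop", "staging", "carts"), indent=2))
```

Whether the operator picks up such a resource depends on the serviceMonitorSelector configured on the Prometheus CR, so the labels above would need to match the actual installation.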


christian-kreuzberger-dtx · 2021-12-17T08:37:37Z

Right now our recommendation is to install prometheus via the official helm chart:

kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus --namespace monitoring

Here the ConfigMap already exists and just needs to be overwritten, though we're having some issues with that, see #240.

Do you think the operator is better suited for this? But would that mean we can no longer support "classical" prometheus on Kubernetes installations?

christian-kreuzberger-dtx · 2022-01-24T15:06:32Z

The following slack discussion https://keptn.slack.com/archives/CNRCGFU3U/p1643028340100100 reveals that we are not compatible with the Prometheus operator.

It seems that the names of services and pods/deployments have changed:

$ kubectl -n monitoring get all
NAME                                                          READY   STATUS              RESTARTS   AGE
pod/alertmanager-prometheus-operator-alertmanager-0           0/2     ContainerCreating   0          10d
pod/metrics-server-85496d4f7c-djzjj                           1/1     Running             0          10d
pod/prometheus-operator-grafana-588d549949-x2tg8              2/2     Running             2          59d
pod/prometheus-operator-grafana-test                          0/1     Completed           0          59d
pod/prometheus-operator-kube-state-metrics-64d56fc9df-wp8wc   1/1     Running             0          10d
pod/prometheus-operator-operator-7fb8c9f85c-2nvgh             2/2     Running             0          10d
pod/prometheus-operator-prometheus-node-exporter-4flc8        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-5lw7d        1/1     Running             4          111d
pod/prometheus-operator-prometheus-node-exporter-72w6s        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-7b6p2        1/1     Running             2          111d
pod/prometheus-operator-prometheus-node-exporter-9rx2g        1/1     Running             3          111d
pod/prometheus-operator-prometheus-node-exporter-q66cl        1/1     Running             2          89d
pod/prometheus-operator-prometheus-node-exporter-rmd9j        1/1     Running             6          89d
pod/prometheus-operator-prometheus-node-exporter-s4f5d        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-v6vbz        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-vbtlh        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-x6vhm        1/1     Running             2          111d
pod/prometheus-operator-prometheus-node-exporter-xccx9        1/1     Running             1          111d
pod/prometheus-prometheus-operator-prometheus-0               3/3     Running             3          58d
pod/telegraf-daemonset-5f9cv                                  2/2     Running             2          111d
pod/telegraf-daemonset-7c2xn                                  2/2     Running             2          111d
pod/telegraf-daemonset-7gxcb                                  2/2     Running             2          111d
pod/telegraf-daemonset-7nfwl                                  2/2     Running             2          111d
pod/telegraf-daemonset-9225k                                  2/2     Running             4          111d
pod/telegraf-daemonset-cqjd5                                  2/2     Running             2          111d
pod/telegraf-daemonset-dp2hb                                  2/2     Running             4          111d
pod/telegraf-daemonset-fhc9m                                  2/2     Running             6          111d
pod/telegraf-daemonset-hjqmj                                  2/2     Running             12         89d
pod/telegraf-daemonset-ljsxf                                  2/2     Running             4          111d
pod/telegraf-daemonset-njsxx                                  2/2     Running             4          89d
pod/telegraf-daemonset-vhhl8                                  2/2     Running             2          111d
pod/telegraf-deployment-6448f95b55-gn4ph                      1/1     Running             0          10d

NAME                                                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-operated                          ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   111d
service/metrics-server                                 ClusterIP   10.233.32.242   <none>        443/TCP                      105d
service/prometheus-operated                            ClusterIP   None            <none>        9090/TCP                     111d
service/prometheus-operator-alertmanager               ClusterIP   10.233.19.107   <none>        9093/TCP                     111d
service/prometheus-operator-grafana                    ClusterIP   10.233.42.175   <none>        80/TCP                       111d
service/prometheus-operator-kube-state-metrics         ClusterIP   10.233.34.37    <none>        8080/TCP                     111d
service/prometheus-operator-operator                   ClusterIP   10.233.57.170   <none>        8080/TCP,443/TCP             111d
service/prometheus-operator-prometheus                 ClusterIP   10.233.35.123   <none>        9090/TCP                     111d
service/prometheus-operator-prometheus-node-exporter   ClusterIP   10.233.36.118   <none>        9100/TCP                     111d
service/telegraf-deployment                            ClusterIP   10.233.37.51    <none>        9273/TCP                     111d

NAME                                                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-operator-prometheus-node-exporter   12        12        12      12           12          <none>          111d
daemonset.apps/telegraf-daemonset                             12        12        12      12           12          <none>          111d

NAME                                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/metrics-server                           1/1     1            1           105d
deployment.apps/prometheus-operator-grafana              1/1     1            1           111d
deployment.apps/prometheus-operator-kube-state-metrics   1/1     1            1           111d
deployment.apps/prometheus-operator-operator             1/1     1            1           111d
deployment.apps/telegraf-deployment                      1/1     1            1           111d

NAME                                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/metrics-server-85496d4f7c                           1         1         1       105d
replicaset.apps/prometheus-operator-grafana-588d549949              1         1         1       59d
replicaset.apps/prometheus-operator-grafana-5c86cf65f9              0         0         0       59d
replicaset.apps/prometheus-operator-grafana-7857784dcd              0         0         0       111d
replicaset.apps/prometheus-operator-kube-state-metrics-64d56fc9df   1         1         1       111d
replicaset.apps/prometheus-operator-operator-7fb8c9f85c             1         1         1       111d
replicaset.apps/telegraf-deployment-6448f95b55                      1         1         1       111d

NAME                                                             READY   AGE
statefulset.apps/alertmanager-prometheus-operator-alertmanager   0/1     111d
statefulset.apps/prometheus-prometheus-operator-prometheus       1/1     111d

NAME                                             COMPLETIONS   DURATION   AGE
job.batch/prometheus-operator-admission-create   1/1           5s         111d
job.batch/prometheus-operator-admission-patch    1/1           94s        111d

We are looking for

service/prometheus-server               ClusterIP   10.24.45.75    <none>        80/TCP     37d
service/prometheus-alertmanager         ClusterIP   10.24.32.99    <none>        80/TCP     37d

in prometheus-service, but those are not available.
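One possible way to handle the renamed services is a simple fallback lookup. The sketch below is only an illustration under stated assumptions: `resolve_prometheus_url` and `CANDIDATES` are hypothetical names, not part of prometheus-service, and the names and ports are copied from the listings in this comment. The `prometheus-operator-prometheus` name depends on the helm release name, so it will differ between installations.

```python
from typing import Iterable, Optional

# Candidate Service names in order of preference, with the ports shown in the
# kubectl output. The first entry is the name prometheus-service expects today.
CANDIDATES = [
    ("prometheus-server", 80),
    ("prometheus-operated", 9090),
    ("prometheus-operator-prometheus", 9090),
]


def resolve_prometheus_url(existing_services: Iterable[str],
                           namespace: str = "monitoring") -> Optional[str]:
    """Return an in-cluster URL for the first known Prometheus Service, or None."""
    present = set(existing_services)
    for name, port in CANDIDATES:
        if name in present:
            # Standard in-cluster DNS name: <service>.<namespace>.svc.cluster.local
            return f"http://{name}.{namespace}.svc.cluster.local:{port}"
    return None


if __name__ == "__main__":
    # Service names as they appear in the operator-based installation.
    print(resolve_prometheus_url(["prometheus-operated", "metrics-server"]))
```

A hard-coded candidate list is fragile; letting users configure the Prometheus endpoint explicitly would avoid guessing release-dependent names.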

bradmccoydev · 2022-09-27T09:00:24Z

@christian-kreuzberger-dtx FYI I am currently doing analysis on this, as I would like to use the operator also.

christian-kreuzberger-dtx · 2022-09-28T07:46:14Z

Sure! Please post your findings here! Looping in @thisthat and @oleg-nenashev on this change.

oleg-nenashev · 2022-09-28T08:13:59Z

+1. I will add it to my watch list for Keptn LTS

jheyduk · 2022-11-01T10:18:43Z

Is anybody working on this? I would give it a try.

bradmccoydev · 2022-11-01T23:34:34Z

My recommendation is that folks using the operator bring their own (BYO) Prometheus configuration and do not use the Keptn configure monitoring. The get sli will still work. I can present this at the developer meeting.

ranyhb · 2023-08-24T05:46:16Z

> My recommendation is that folks using the operator bring their own (BYO) Prometheus configuration and do not use the Keptn configure monitoring. The get sli will still work. I can present this at the developer meeting.

So what do you suggest doing?

christian-kreuzberger-dtx added the type:feature New feature or request that provides value to the stakeholders/end-users label Jan 24, 2022
