This repository has been archived by the owner on Jan 19, 2024. It is now read-only.

Support prometheus deployed via prometheus-operator #193

Open
jaylevin opened this issue Nov 18, 2021 · 8 comments
Labels
type:feature New feature or request that provides value to the stakeholders/end-users

Comments

@jaylevin

jaylevin commented Nov 18, 2021

This issue addresses the incompatibility between the Keptn prometheus-service and Prometheus deployed via the Prometheus Operator.

Currently, the Keptn prometheus-service depends on reading and writing both the Prometheus and Alertmanager ConfigMaps that are deployed as part of the Prometheus Community Helm chart. However, when Prometheus is deployed on Kubernetes via the prometheus-operator, these ConfigMaps do not exist.

Instead (from my very limited understanding), the prometheus-operator watches for ServiceMonitor CRs in order to configure new scrape jobs. Ideally, the prometheus-service Keptn integration would deploy these CRs itself, creating a scrape job for each service/project/stage that is configured to be monitored.
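For illustration, here is a minimal sketch of what such a ServiceMonitor could look like, rendered by a small shell helper. This is not something prometheus-service does today. The function name `render_service_monitor`, the `keptn-<project>-<stage>-<service>` naming scheme, the `<project>-<stage>` target namespace, the `app` label selector, the port name `http`, and the `release: prometheus-operator` label are all assumptions; the label in particular has to match whatever `serviceMonitorSelector` the Prometheus resource in your cluster uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of prometheus-service): prints a ServiceMonitor
# manifest for one Keptn service to stdout.
# Usage: render_service_monitor <project> <stage> <service>
render_service_monitor() {
  project="$1"
  stage="$2"
  service="$3"
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # Assumed naming scheme, one ServiceMonitor per service/project/stage
  name: keptn-${project}-${stage}-${service}
  namespace: monitoring
  labels:
    # Assumption: must match the serviceMonitorSelector of your Prometheus CR
    release: prometheus-operator
spec:
  namespaceSelector:
    matchNames:
      # Assumption: the service runs in the <project>-<stage> namespace
      - ${project}-${stage}
  selector:
    matchLabels:
      # Assumption: the Kubernetes Service carries an app=<service> label
      app: ${service}
  endpoints:
    # Assumption: the Service exposes a named port "http" serving /metrics
    - port: http
      path: /metrics
      interval: 30s
EOF
}
```

The output could then be piped into the cluster, for example `render_service_monitor sockshop staging carts | kubectl apply -f -`, where `sockshop`, `staging`, and `carts` are placeholder names.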

@christian-kreuzberger-dtx
Contributor

Right now our recommendation is to install prometheus via the official helm chart:

kubectl create namespace monitoring
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus --namespace monitoring

Here the ConfigMap already exists and just needs to be overwritten, though we're having some issues with that; see #240.

Do you think the operator is better suited for this? But would that mean we can no longer support "classical" prometheus on Kubernetes installations?

@christian-kreuzberger-dtx
Contributor

The following Slack discussion https://keptn.slack.com/archives/CNRCGFU3U/p1643028340100100 reveals that we are not compatible with the Prometheus Operator.

It seems that the names of services and pods/deployments have changed:

$ kubectl -n monitoring get all
NAME                                                          READY   STATUS              RESTARTS   AGE
pod/alertmanager-prometheus-operator-alertmanager-0           0/2     ContainerCreating   0          10d
pod/metrics-server-85496d4f7c-djzjj                           1/1     Running             0          10d
pod/prometheus-operator-grafana-588d549949-x2tg8              2/2     Running             2          59d
pod/prometheus-operator-grafana-test                          0/1     Completed           0          59d
pod/prometheus-operator-kube-state-metrics-64d56fc9df-wp8wc   1/1     Running             0          10d
pod/prometheus-operator-operator-7fb8c9f85c-2nvgh             2/2     Running             0          10d
pod/prometheus-operator-prometheus-node-exporter-4flc8        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-5lw7d        1/1     Running             4          111d
pod/prometheus-operator-prometheus-node-exporter-72w6s        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-7b6p2        1/1     Running             2          111d
pod/prometheus-operator-prometheus-node-exporter-9rx2g        1/1     Running             3          111d
pod/prometheus-operator-prometheus-node-exporter-q66cl        1/1     Running             2          89d
pod/prometheus-operator-prometheus-node-exporter-rmd9j        1/1     Running             6          89d
pod/prometheus-operator-prometheus-node-exporter-s4f5d        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-v6vbz        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-vbtlh        1/1     Running             1          111d
pod/prometheus-operator-prometheus-node-exporter-x6vhm        1/1     Running             2          111d
pod/prometheus-operator-prometheus-node-exporter-xccx9        1/1     Running             1          111d
pod/prometheus-prometheus-operator-prometheus-0               3/3     Running             3          58d
pod/telegraf-daemonset-5f9cv                                  2/2     Running             2          111d
pod/telegraf-daemonset-7c2xn                                  2/2     Running             2          111d
pod/telegraf-daemonset-7gxcb                                  2/2     Running             2          111d
pod/telegraf-daemonset-7nfwl                                  2/2     Running             2          111d
pod/telegraf-daemonset-9225k                                  2/2     Running             4          111d
pod/telegraf-daemonset-cqjd5                                  2/2     Running             2          111d
pod/telegraf-daemonset-dp2hb                                  2/2     Running             4          111d
pod/telegraf-daemonset-fhc9m                                  2/2     Running             6          111d
pod/telegraf-daemonset-hjqmj                                  2/2     Running             12         89d
pod/telegraf-daemonset-ljsxf                                  2/2     Running             4          111d
pod/telegraf-daemonset-njsxx                                  2/2     Running             4          89d
pod/telegraf-daemonset-vhhl8                                  2/2     Running             2          111d
pod/telegraf-deployment-6448f95b55-gn4ph                      1/1     Running             0          10d

NAME                                                   TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-operated                          ClusterIP   None            <none>        9093/TCP,9094/TCP,9094/UDP   111d
service/metrics-server                                 ClusterIP   10.233.32.242   <none>        443/TCP                      105d
service/prometheus-operated                            ClusterIP   None            <none>        9090/TCP                     111d
service/prometheus-operator-alertmanager               ClusterIP   10.233.19.107   <none>        9093/TCP                     111d
service/prometheus-operator-grafana                    ClusterIP   10.233.42.175   <none>        80/TCP                       111d
service/prometheus-operator-kube-state-metrics         ClusterIP   10.233.34.37    <none>        8080/TCP                     111d
service/prometheus-operator-operator                   ClusterIP   10.233.57.170   <none>        8080/TCP,443/TCP             111d
service/prometheus-operator-prometheus                 ClusterIP   10.233.35.123   <none>        9090/TCP                     111d
service/prometheus-operator-prometheus-node-exporter   ClusterIP   10.233.36.118   <none>        9100/TCP                     111d
service/telegraf-deployment                            ClusterIP   10.233.37.51    <none>        9273/TCP                     111d

NAME                                                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/prometheus-operator-prometheus-node-exporter   12        12        12      12           12          <none>          111d
daemonset.apps/telegraf-daemonset                             12        12        12      12           12          <none>          111d

NAME                                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/metrics-server                           1/1     1            1           105d
deployment.apps/prometheus-operator-grafana              1/1     1            1           111d
deployment.apps/prometheus-operator-kube-state-metrics   1/1     1            1           111d
deployment.apps/prometheus-operator-operator             1/1     1            1           111d
deployment.apps/telegraf-deployment                      1/1     1            1           111d

NAME                                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/metrics-server-85496d4f7c                           1         1         1       105d
replicaset.apps/prometheus-operator-grafana-588d549949              1         1         1       59d
replicaset.apps/prometheus-operator-grafana-5c86cf65f9              0         0         0       59d
replicaset.apps/prometheus-operator-grafana-7857784dcd              0         0         0       111d
replicaset.apps/prometheus-operator-kube-state-metrics-64d56fc9df   1         1         1       111d
replicaset.apps/prometheus-operator-operator-7fb8c9f85c             1         1         1       111d
replicaset.apps/telegraf-deployment-6448f95b55                      1         1         1       111d

NAME                                                             READY   AGE
statefulset.apps/alertmanager-prometheus-operator-alertmanager   0/1     111d
statefulset.apps/prometheus-prometheus-operator-prometheus       1/1     111d

NAME                                             COMPLETIONS   DURATION   AGE
job.batch/prometheus-operator-admission-create   1/1           5s         111d
job.batch/prometheus-operator-admission-patch    1/1           94s        111d

We are looking for

service/prometheus-server               ClusterIP   10.24.45.75    <none>        80/TCP     37d
service/prometheus-alertmanager         ClusterIP   10.24.32.99    <none>        80/TCP     37d

in prometheus-service, but those are not available.
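One way to bridge the naming difference would be to probe for whichever Prometheus service is actually present. The sketch below is not how prometheus-service currently behaves. `pick_prometheus_service` is a hypothetical helper that reads service names from stdin, in the `service/<name>` form printed by `kubectl get svc -o name`. It prefers the community-chart name `prometheus-server` and falls back to `prometheus-operated`, the headless service visible in the operator listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of prometheus-service): reads lines of the form
# "service/<name>" from stdin and prints the first known Prometheus service
# name it finds. Returns non-zero if none of the candidates is present.
pick_prometheus_service() {
  services="$(cat)"
  # Candidates in order of preference: community Helm chart first,
  # then the service created for operator-managed Prometheus instances.
  for candidate in prometheus-server prometheus-operated; do
    # -x matches the whole line, so "service/prometheus-operator-prometheus"
    # does not accidentally match "prometheus-operated".
    if printf '%s\n' "$services" | grep -qx "service/${candidate}"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

Against a live cluster this could be used as `kubectl -n monitoring get svc -o name | pick_prometheus_service`. The same approach would apply to the Alertmanager service (`prometheus-alertmanager` versus `alertmanager-operated`).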

@christian-kreuzberger-dtx christian-kreuzberger-dtx added the type:feature New feature or request that provides value to the stakeholders/end-users label Jan 24, 2022
@bradmccoydev

@christian-kreuzberger-dtx FYI I am currently doing analysis on this, as I would like to use the operator also.

@christian-kreuzberger-dtx
Contributor

Sure! Please post your findings here! Looping in @thisthat and @oleg-nenashev on this change.

@oleg-nenashev

+1. I will add it to my watch list for Keptn LTS

@jheyduk

jheyduk commented Nov 1, 2022

Is anybody working on this? I would give it a try.

@bradmccoydev

My recommendation is that folks using the operator bring their own (BYO) Prometheus configuration and don't use Keptn's configure monitoring. The get-sli handling will still work. I can present it at the developer meeting.

@ranyhb

ranyhb commented Aug 24, 2023

> My recommendation is that folks using the operator bring their own (BYO) Prometheus configuration and don't use Keptn's configure monitoring. The get-sli handling will still work. I can present it at the developer meeting.

So what do you suggest doing?


6 participants