-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨(backend) add prometheus metrics, liveness and readiness probe endpoints #562
base: main
Are you sure you want to change the base?
Conversation
4b6f8be
to
25b8d52
Compare
Provides Prometheus metrics, a custom metrics exporter, a CIDR filter for monitoring targeted views, and readiness/liveness probe endpoints for Kubernetes. Signed-off-by: lindenb1 <[email protected]>
25b8d52
to
755538d
Compare
@virgile-dev Hey Virgile, could you please review the Prometheus implementation? At the moment, it is designed as an on/off switch, so it should not interfere significantly (if at all) with your existing code. In addition to the Prometheus metrics, I have added the standard probes required by Kubernetes (readiness and liveness checks). I also included a CIDR filter decorator so that the metrics, readiness, and liveness endpoints are accessible only from specific networks, such as the private Kubernetes network where the application is running. |
@sampaccoud Can you or @AntoLC please do a review? |
if os.environ.get("MONITORING_PROBING", "False").lower() == "true": | ||
urlpatterns.append( | ||
path( | ||
"probes/liveness/", | ||
monitoring_cidr_protected_view(liveness_check), | ||
name="liveness-probe", | ||
), | ||
) | ||
urlpatterns.append( | ||
path( | ||
"probes/readiness/", | ||
monitoring_cidr_protected_view(readiness_check), | ||
name="readiness-probe", | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that this would be redundant with what we already have on /lbheartbeat and /heartbeat with Mozilla's dockerflow.
Can you check before I review the rest?
if doc_ages["oldest"]: | ||
metrics.append( | ||
GaugeMetricFamily( | ||
prefixed_metric_name("oldest_document_date"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's recommended to include "timestamp"in the metric name and have a _seconds suffix (from https://prometheus.io/docs/practices/naming/#metric-names)
prefixed_metric_name("oldest_document_date"), | |
prefixed_metric_name("oldest_document_timestamp_seconds"), |
if doc_ages["newest"]: | ||
metrics.append( | ||
GaugeMetricFamily( | ||
prefixed_metric_name("newest_document_date"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefixed_metric_name("newest_document_date"), | |
prefixed_metric_name("newest_document_timestamp_seconds"), |
|
||
# -- User document distribution | ||
user_distribution_metric = GaugeMetricFamily( | ||
prefixed_metric_name("user_document_distribution"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(suggestion) not sure that the "_distribution" is meaningful?
prefixed_metric_name("user_document_distribution"), | |
prefixed_metric_name("user_documents"), |
Provides Prometheus metrics, a custom metrics exporter, a CIDR filter for monitoring targeted views,
and readiness/liveness probe endpoints for Kubernetes.
Purpose
Simplify operational observability and health checking in containerized environments. By exposing both standard and custom metrics from Django for Prometheus, you can gain deeper insights into application performance. The integrated CIDR filter ensures that only allowed IPs can access the monitoring and probe endpoints of the application, and the built-in readiness/liveness probes provide health checks for Kubernetes to manage pod lifecycles effectively.
Proposal
Integrate these changes to benefit from metrics collection, and streamlined health checks. By centralizing these responsibilities, teams can reduce complexity and maintain a clearer view of application behavior within Kubernetes clusters and similar environments.