---
keywords: [Kubernetes, Prometheus, monitoring, metrics, observability, GreptimeDB, Prometheus Operator, Grafana]
description: Guide to monitoring Kubernetes metrics using Prometheus with GreptimeDB as the storage backend, including architecture overview, installation, and visualization with Grafana.
---

# Monitor Kubernetes Metrics with Prometheus and GreptimeDB

This guide demonstrates how to set up a complete Kubernetes monitoring solution using Prometheus for metrics collection and GreptimeDB as the long-term storage backend.

## What is Kubernetes Monitoring

Kubernetes monitoring is the practice of collecting, analyzing, and acting on metrics and logs from a Kubernetes cluster.
It provides visibility into the health, performance, and resource utilization of your containerized applications and infrastructure.

Key aspects of Kubernetes monitoring include:

- **Resource Metrics**: CPU, memory, disk, and network usage for nodes, pods, and containers
- **Cluster Health**: Status of cluster components like kube-apiserver, etcd, and controller-manager
- **Application Metrics**: Custom metrics from your applications running in the cluster
- **Events and Logs**: Kubernetes events and container logs for troubleshooting

Effective monitoring helps you:

- Detect and diagnose issues before they impact users
- Optimize resource utilization and reduce costs
- Plan capacity based on historical trends
- Ensure SLA compliance
- Troubleshoot performance bottlenecks

## Architecture Overview

The monitoring architecture consists of the following components:

**Components:**

- **kube-state-metrics**: Exports cluster-level metrics about Kubernetes objects (deployments, pods, services, etc.)
- **Node Exporter**: Exports hardware and OS-level metrics from each Kubernetes node
- **Prometheus Operator**: Automates Prometheus deployment and configuration using Kubernetes custom resources
- **GreptimeDB**: Acts as the long-term storage backend for Prometheus metrics with high compression and query performance
- **Grafana**: Provides dashboards and visualizations for metrics stored in GreptimeDB

## Prerequisites

Before starting, ensure you have:

- A running Kubernetes cluster (version >= 1.18)
- `kubectl` configured to access your cluster
- [Helm](https://helm.sh/docs/intro/install/) v3.0.0 or higher installed
- Sufficient cluster resources (at least 2 CPU cores and 4GB memory available)

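As a quick sanity check, the version requirements above can be gated in shell. The `version_ge` function is an ad-hoc helper written for this sketch (it relies on GNU `sort -V`), not part of kubectl or Helm:

```shell
# Ad-hoc helper: true if version $1 >= version $2 (needs GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example checks against the minimums listed above.
version_ge "1.28.0" "1.18.0" && echo "Kubernetes version OK"
version_ge "3.14.0" "3.0.0"  && echo "Helm version OK"
```

In practice you would substitute the versions reported by `kubectl version` and `helm version` for the example values.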
## Install GreptimeDB

GreptimeDB serves as the long-term storage backend for Prometheus metrics.
For detailed installation steps,
please refer to the [Deploy GreptimeDB Cluster](/user-guide/deployments-administration/deploy-on-kubernetes/deploy-greptimedb-cluster.md) documentation.

### Verify the GreptimeDB Installation

After deploying GreptimeDB, verify that the cluster is running.
In this guide we assume the GreptimeDB cluster is deployed in the `greptime-cluster` namespace and named `greptimedb`.

```bash
kubectl -n greptime-cluster get greptimedbclusters.greptime.io greptimedb
```

```bash
NAME         FRONTEND   DATANODE   META   FLOWNODE   PHASE     VERSION   AGE
greptimedb   1          2          1      1          Running   v0.17.2   33s
```

Check the pods:

```bash
kubectl get pods -n greptime-cluster
```

```bash
NAME                                  READY   STATUS    RESTARTS   AGE
greptimedb-datanode-0                 1/1     Running   0          71s
greptimedb-datanode-1                 1/1     Running   0          97s
greptimedb-flownode-0                 1/1     Running   0          64s
greptimedb-frontend-8bf9f558c-7wdmk   1/1     Running   0          90s
greptimedb-meta-fc4ddb78b-nv944       1/1     Running   0          87s
```

### Access GreptimeDB

To interact with GreptimeDB directly, you can port-forward the frontend service to your local machine.
GreptimeDB supports multiple protocols, with MySQL protocol available on port `4002` by default.

```bash
kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4002:4002
```

Connect using any MySQL-compatible client:

```bash
mysql -h 127.0.0.1 -P 4002
```

### Storage Partitioning

To improve query performance and reduce storage costs,
GreptimeDB automatically creates columns based on Prometheus metric labels and stores metrics in a physical table.
The default table name is `greptime_physical_table`.
Since we deployed a GreptimeDB cluster with [multiple datanodes](#verify-the-greptimedb-installation),
you can partition the table to distribute data across datanodes for better scalability and performance.

In this Kubernetes monitoring scenario, we can use the `namespace` label as the partition key.
For example, with namespaces like `kube-public`, `kube-system`, `monitoring`, `default`, `greptime-cluster`, and `etcd-cluster`,
you can create a partitioning scheme based on the first letter of the namespace:

```sql
CREATE TABLE greptime_physical_table (
  greptime_value DOUBLE NULL,
  namespace STRING PRIMARY KEY,
  greptime_timestamp TIMESTAMP TIME INDEX
)
PARTITION ON COLUMNS (namespace) (
  namespace < 'f',
  namespace >= 'f' AND namespace < 'g',
  namespace >= 'g' AND namespace < 'k',
  namespace >= 'k'
)
ENGINE = metric
WITH (
  "physical_metric_table" = ""
);
```

For more information about Prometheus metrics storage and query performance optimization, refer to the [Improve efficiency by using metric engine](/user-guide/ingest-data/for-observability/prometheus.md#improve-efficiency-by-using-metric-engine) guide.
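To see how these first-letter ranges bucket the example namespaces, here is a small shell sketch. The `partition_for` helper is illustrative only and mirrors the `PARTITION ON COLUMNS` ranges above; it is not part of GreptimeDB or any tooling:

```shell
# Illustrative helper: report which partition a namespace falls into,
# mirroring the four first-letter ranges in the CREATE TABLE statement.
partition_for() {
  case "$1" in
    [a-e]*) echo "partition 1 (namespace < 'f')" ;;
    f*)     echo "partition 2 ('f' <= namespace < 'g')" ;;
    [g-j]*) echo "partition 3 ('g' <= namespace < 'k')" ;;
    *)      echo "partition 4 (namespace >= 'k')" ;;
  esac
}

# Bucket the example namespaces from this guide.
for ns in kube-system monitoring default greptime-cluster etcd-cluster; do
  printf '%-18s -> %s\n' "$ns" "$(partition_for "$ns")"
done
```

Note that the ranges cover the whole key space, so any namespace created later still lands in exactly one partition.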

### Prometheus URLs in GreptimeDB

GreptimeDB provides [Prometheus-compatible APIs](/user-guide/query-data/promql.md#prometheus-http-api) under the HTTP context `/v1/prometheus/`,
enabling seamless integration with existing Prometheus workflows.

To integrate Prometheus with GreptimeDB, you need the GreptimeDB service address.
Since GreptimeDB runs inside the Kubernetes cluster, use the internal cluster address.

The GreptimeDB frontend service address follows this pattern:

```
<greptimedb-name>-frontend.<namespace>.svc.cluster.local:<port>
```

In this guide:

- GreptimeDB cluster name: `greptimedb`
- Namespace: `greptime-cluster`
- Frontend port: `4000`

So the service address is:

```
greptimedb-frontend.greptime-cluster.svc.cluster.local:4000
```

The complete [Remote Write URL](/user-guide/ingest-data/for-observability/prometheus.md#remote-write-configuration) for Prometheus is:

```
http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write
```

This URL consists of:

- **Service endpoint**: `greptimedb-frontend.greptime-cluster.svc.cluster.local:4000`
- **API path**: `/v1/prometheus/write`
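The address assembly above can be sketched in shell. The values are the ones assumed throughout this guide; substitute your own cluster name, namespace, and port:

```shell
# Values assumed in this guide; replace with your own deployment's.
CLUSTER_NAME="greptimedb"
NAMESPACE="greptime-cluster"
FRONTEND_PORT="4000"

# Pattern: <greptimedb-name>-frontend.<namespace>.svc.cluster.local:<port>
SERVICE_ADDR="${CLUSTER_NAME}-frontend.${NAMESPACE}.svc.cluster.local:${FRONTEND_PORT}"

# Remote write endpoint under the /v1/prometheus HTTP context.
REMOTE_WRITE_URL="http://${SERVICE_ADDR}/v1/prometheus/write"

echo "${REMOTE_WRITE_URL}"
```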

## Install Prometheus

Now that GreptimeDB is running, we'll install Prometheus to collect metrics and send them to GreptimeDB for long-term storage.

### Add the Prometheus Community Helm Repository

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

### Install the kube-prometheus-stack

The [`kube-prometheus-stack`](https://github.com/prometheus-operator/kube-prometheus) is a comprehensive monitoring solution that includes
Prometheus, Grafana, kube-state-metrics, and node-exporter components.
This stack automatically discovers and monitors all Kubernetes namespaces,
collecting metrics from cluster components, nodes, and workloads.

In this deployment, we'll configure Prometheus to use GreptimeDB as the remote write destination for long-term metric storage and configure Grafana's default Prometheus data source to use GreptimeDB.

Create a `kube-prometheus-values.yaml` file with the following configuration:

```yaml
# Configure Prometheus remote write to GreptimeDB
prometheus:
  prometheusSpec:
    remoteWrite:
      - url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write

# Configure Grafana to use GreptimeDB as the default Prometheus data source
grafana:
  datasources:
    datasources.yaml:
      apiVersion: 1
      datasources:
        - name: Prometheus
          type: prometheus
          url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus
          access: proxy
          editable: true
```

This configuration file specifies [the GreptimeDB service address](#prometheus-urls-in-greptimedb) for:

- **Prometheus remote write**: Sends collected metrics to GreptimeDB for long-term storage
- **Grafana data source**: Configures GreptimeDB as the default Prometheus data source for dashboard queries

Install the `kube-prometheus-stack` using Helm with the custom values file:

```bash
helm install kube-prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --values kube-prometheus-values.yaml
```

### Verify the Installation

Check that all Prometheus components are running:

```bash
kubectl get pods -n monitoring
```

```bash
NAME                                                     READY   STATUS    RESTARTS   AGE
alertmanager-kube-prometheus-kube-prome-alertmanager-0   2/2     Running   0          60s
kube-prometheus-grafana-78ccf96696-sghx4                 3/3     Running   0          78s
kube-prometheus-kube-prome-operator-775fdbfd75-w88n7     1/1     Running   0          78s
kube-prometheus-kube-state-metrics-5bd5747f46-d2sxs      1/1     Running   0          78s
kube-prometheus-prometheus-node-exporter-ts9nn           1/1     Running   0          78s
prometheus-kube-prometheus-kube-prome-prometheus-0       2/2     Running   0          60s
```

### Verify the Monitoring Status

Use the [MySQL protocol](#access-greptimedb) to query GreptimeDB and verify that Prometheus metrics are being written.

```sql
SHOW TABLES;
```

You should see tables created for various Prometheus metrics.

```sql
+----------------------------------------------+
| Tables                                       |
+----------------------------------------------+
| :node_memory_MemAvailable_bytes:sum          |
| ALERTS                                       |
| ALERTS_FOR_STATE                             |
| aggregator_discovery_aggregation_count_total |
| aggregator_unavailable_apiservice            |
| alertmanager_alerts                          |
| alertmanager_alerts_invalid_total            |
| alertmanager_alerts_received_total           |
| alertmanager_build_info                      |
| ......                                       |
+----------------------------------------------+
1553 rows in set (0.18 sec)
```

## Use Grafana for Visualization

Grafana is included in the kube-prometheus-stack and comes pre-configured with dashboards for comprehensive Kubernetes monitoring.

### Access Grafana

Port-forward the Grafana service to access the web interface:

```bash
kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80
```

### Get Admin Credentials

Retrieve the admin password using kubectl:

```bash
kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
```
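As an illustration of what the pipeline above does: the jsonpath expression extracts the base64-encoded secret value, and `base64 --decode` recovers the plaintext. For example, decoding the value commonly seen as the chart's default admin password (your deployment's secret may differ):

```shell
# 'cHJvbS1vcGVyYXRvcg==' is base64 for 'prom-operator', a commonly used
# default admin password in the kube-prometheus-stack chart; yours may differ.
echo 'cHJvbS1vcGVyYXRvcg==' | base64 --decode ; echo
```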

### Log in to Grafana

1. Open your browser and navigate to [http://localhost:3000](http://localhost:3000)
2. Log in with:
   - **Username**: `admin`
   - **Password**: The password retrieved in the previous step

### Explore Pre-configured Dashboards

After logging in, navigate to **Dashboards** to explore the pre-configured Kubernetes monitoring dashboards:

- **Kubernetes / Compute Resources / Cluster**: Overview of cluster-wide resource utilization
- **Kubernetes / Compute Resources / Namespace (Pods)**: Resource usage breakdown by namespace
- **Kubernetes / Compute Resources / Node (Pods)**: Node-level resource monitoring
- **Node Exporter / Nodes**: Detailed node hardware and OS metrics

## Conclusion

You now have a complete Kubernetes monitoring solution with Prometheus collecting metrics and GreptimeDB providing efficient long-term storage. This setup enables you to:

- Monitor cluster and application health in real-time
- Store metrics for historical analysis and capacity planning
- Create rich visualizations and dashboards with Grafana
- Query metrics using both PromQL and SQL

For more information about GreptimeDB and Prometheus integration, see:

- [Prometheus Integration](/user-guide/ingest-data/for-observability/prometheus.md)
- [Query Data in GreptimeDB](/user-guide/query-data/overview.md)