Commit 41a19f5
docs: monitor kubernetes (#2188)

---
keywords: [Kubernetes, Prometheus, monitoring, metrics, observability, GreptimeDB, Prometheus Operator, Grafana]
description: Guide to monitoring Kubernetes metrics using Prometheus with GreptimeDB as the storage backend, including architecture overview, installation, and visualization with Grafana.
---

# Monitor Kubernetes Metrics with Prometheus and GreptimeDB

This guide demonstrates how to set up a complete Kubernetes monitoring solution using Prometheus for metrics collection and GreptimeDB as the long-term storage backend.

## What is Kubernetes Monitoring

Kubernetes monitoring is the practice of collecting, analyzing, and acting on metrics and logs from a Kubernetes cluster.
It provides visibility into the health, performance, and resource utilization of your containerized applications and infrastructure.

Key aspects of Kubernetes monitoring include:

- **Resource Metrics**: CPU, memory, disk, and network usage for nodes, pods, and containers
- **Cluster Health**: Status of cluster components like kube-apiserver, etcd, and controller-manager
- **Application Metrics**: Custom metrics from your applications running in the cluster
- **Events and Logs**: Kubernetes events and container logs for troubleshooting

Effective monitoring helps you:
- Detect and diagnose issues before they impact users
- Optimize resource utilization and reduce costs
- Plan capacity based on historical trends
- Ensure SLA compliance
- Troubleshoot performance bottlenecks

## Architecture Overview

The monitoring architecture consists of the following components:

![Kubernetes Monitoring Architecture](/k8s-metrics-monitor-architecture.drawio.svg)

**Components:**

- **kube-state-metrics**: Exports cluster-level metrics about Kubernetes objects (deployments, pods, services, etc.)
- **Node Exporter**: Exports hardware and OS-level metrics from each Kubernetes node
- **Prometheus Operator**: Automates Prometheus deployment and configuration using Kubernetes custom resources
- **GreptimeDB**: Acts as the long-term storage backend for Prometheus metrics with high compression and query performance
- **Grafana**: Provides dashboards and visualizations for metrics stored in GreptimeDB

## Prerequisites

Before starting, ensure you have:

- A running Kubernetes cluster (version >= 1.18)
- `kubectl` configured to access your cluster
- [Helm](https://helm.sh/docs/intro/install/) v3.0.0 or higher installed
- Sufficient cluster resources (at least 2 CPU cores and 4GB memory available)
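
As a quick sanity check before proceeding, the commands below are one way to confirm the tools and cluster are ready (the exact output will vary with your environment):

```bash
# Confirm kubectl can reach the cluster and report client/server versions
kubectl version

# Confirm Helm v3 is installed
helm version --short

# Check that the nodes are Ready
kubectl get nodes
```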

## Install GreptimeDB

GreptimeDB serves as the long-term storage backend for Prometheus metrics.
For detailed installation steps,
please refer to the [Deploy GreptimeDB Cluster](/user-guide/deployments-administration/deploy-on-kubernetes/deploy-greptimedb-cluster.md) documentation.

### Verify the GreptimeDB Installation

After deploying GreptimeDB, verify that the cluster is running.
In this guide we assume the GreptimeDB cluster is deployed in the `greptime-cluster` namespace and named `greptimedb`.

```bash
kubectl -n greptime-cluster get greptimedbclusters.greptime.io greptimedb
```

```bash
NAME         FRONTEND   DATANODE   META   FLOWNODE   PHASE     VERSION   AGE
greptimedb   1          2          1      1          Running   v0.17.2   33s
```

Check the pods:

```bash
kubectl get pods -n greptime-cluster
```

```bash
NAME                                  READY   STATUS    RESTARTS   AGE
greptimedb-datanode-0                 1/1     Running   0          71s
greptimedb-datanode-1                 1/1     Running   0          97s
greptimedb-flownode-0                 1/1     Running   0          64s
greptimedb-frontend-8bf9f558c-7wdmk   1/1     Running   0          90s
greptimedb-meta-fc4ddb78b-nv944       1/1     Running   0          87s
```

### Access GreptimeDB

To interact with GreptimeDB directly, you can port-forward the frontend service to your local machine.
GreptimeDB supports multiple protocols, with MySQL protocol available on port `4002` by default.

```bash
kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4002:4002
```

Connect using any MySQL-compatible client:

```bash
mysql -h 127.0.0.1 -P 4002
```
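
Once connected, a simple statement such as the following confirms that the frontend is serving SQL queries (the database list you see may differ):

```sql
-- List the databases visible to the current connection
SHOW DATABASES;
```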

### Storage Partitioning

To improve query performance and reduce storage costs,
GreptimeDB automatically creates columns based on Prometheus metric labels and stores all metrics in a single physical table.
The default table name is `greptime_physical_table`.
Since we deployed a GreptimeDB cluster with [multiple datanodes](#verify-the-greptimedb-installation),
you can partition this table to distribute data across datanodes for better scalability and performance.

In this Kubernetes monitoring scenario, we can use the `namespace` label as the partition key.
For example, with namespaces like `kube-public`, `kube-system`, `monitoring`, `default`, `greptime-cluster`, and `etcd-cluster`,
you can create a partitioning scheme based on the first letter of the namespace:

```sql
CREATE TABLE greptime_physical_table (
    greptime_value DOUBLE NULL,
    namespace STRING PRIMARY KEY,
    greptime_timestamp TIMESTAMP TIME INDEX
)
PARTITION ON COLUMNS (namespace) (
    namespace < 'f',
    namespace >= 'f' AND namespace < 'g',
    namespace >= 'g' AND namespace < 'k',
    namespace >= 'k'
)
ENGINE = metric
WITH (
    "physical_metric_table" = ""
);
```
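
To double-check that the partition rules were applied, one way is to inspect the table definition from the same MySQL connection:

```sql
-- The output should include the PARTITION ON COLUMNS (namespace) clause defined above
SHOW CREATE TABLE greptime_physical_table;
```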

For more information about Prometheus metrics storage and query performance optimization, refer to the [Improve efficiency by using metric engine](/user-guide/ingest-data/for-observability/prometheus.md#improve-efficiency-by-using-metric-engine) guide.

### Prometheus URLs in GreptimeDB

GreptimeDB provides [Prometheus-compatible APIs](/user-guide/query-data/promql.md#prometheus-http-api) under the HTTP context `/v1/prometheus/`,
enabling seamless integration with existing Prometheus workflows.

To integrate Prometheus with GreptimeDB, you need the GreptimeDB service address.
Since GreptimeDB runs inside the Kubernetes cluster, use the internal cluster address.

The GreptimeDB frontend service address follows this pattern:
```
<greptimedb-name>-frontend.<namespace>.svc.cluster.local:<port>
```

In this guide:
- GreptimeDB cluster name: `greptimedb`
- Namespace: `greptime-cluster`
- Frontend port: `4000`

So the service address is:
```bash
greptimedb-frontend.greptime-cluster.svc.cluster.local:4000
```

The complete [Remote Write URL](/user-guide/ingest-data/for-observability/prometheus.md#remote-write-configuration) for Prometheus is:

```bash
http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write
```

This URL consists of:
- **Service endpoint**: `greptimedb-frontend.greptime-cluster.svc.cluster.local:4000`
- **API path**: `/v1/prometheus/write`
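
If you want to smoke-test this endpoint before installing Prometheus, one option (a sketch; adjust the namespace and service name if yours differ) is to port-forward the frontend HTTP port and issue an instant query against the Prometheus-compatible read API. The result set stays empty until Prometheus begins writing metrics:

```bash
# Forward the frontend HTTP port (4000) to your local machine
kubectl port-forward -n greptime-cluster svc/greptimedb-frontend 4000:4000

# In another terminal: run an instant query through the Prometheus-compatible API
curl -s 'http://localhost:4000/v1/prometheus/api/v1/query?query=up'
```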

## Install Prometheus

Now that GreptimeDB is running, we'll install Prometheus to collect metrics and send them to GreptimeDB for long-term storage.

### Add the Prometheus Community Helm Repository

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```

### Install the kube-prometheus-stack

The [`kube-prometheus-stack`](https://github.com/prometheus-operator/kube-prometheus) is a comprehensive monitoring solution that includes
Prometheus, Grafana, kube-state-metrics, and node-exporter components.
This stack automatically discovers and monitors all Kubernetes namespaces,
collecting metrics from cluster components, nodes, and workloads.

In this deployment, we'll configure Prometheus to use GreptimeDB as the remote write destination for long-term metric storage and configure Grafana's default Prometheus data source to use GreptimeDB.

Create a `kube-prometheus-values.yaml` file with the following configuration:

```yaml
# Configure Prometheus remote write to GreptimeDB
prometheus:
  prometheusSpec:
    remoteWrite:
      - url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus/write

# Configure Grafana to use GreptimeDB as the default Prometheus data source
grafana:
  datasources:
    datasources.yaml:
      apiVersion: 1
      datasources:
        - name: Prometheus
          type: prometheus
          url: http://greptimedb-frontend.greptime-cluster.svc.cluster.local:4000/v1/prometheus
          access: proxy
          editable: true
```

This configuration file specifies [the GreptimeDB service address](#prometheus-urls-in-greptimedb) for:
- **Prometheus remote write**: Sends collected metrics to GreptimeDB for long-term storage
- **Grafana data source**: Configures GreptimeDB as the default Prometheus data source for dashboard queries

Install the `kube-prometheus-stack` using Helm with the custom values file:

```bash
helm install kube-prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --values kube-prometheus-values.yaml
```
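
As an optional check, you can list the Helm releases in the `monitoring` namespace to confirm the chart was deployed (chart and app versions will vary):

```bash
helm list -n monitoring
```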

### Verify the Installation

Check that all Prometheus components are running:

```bash
kubectl get pods -n monitoring
```

```bash
NAME                                                      READY   STATUS    RESTARTS   AGE
alertmanager-kube-prometheus-kube-prome-alertmanager-0    2/2     Running   0          60s
kube-prometheus-grafana-78ccf96696-sghx4                  3/3     Running   0          78s
kube-prometheus-kube-prome-operator-775fdbfd75-w88n7      1/1     Running   0          78s
kube-prometheus-kube-state-metrics-5bd5747f46-d2sxs       1/1     Running   0          78s
kube-prometheus-prometheus-node-exporter-ts9nn            1/1     Running   0          78s
prometheus-kube-prometheus-kube-prome-prometheus-0        2/2     Running   0          60s
```

### Verify the Monitoring Status

Use [MySQL protocol](#access-greptimedb) to query GreptimeDB and verify that Prometheus metrics are being written.

```sql
SHOW TABLES;
```

You should see tables created for various Prometheus metrics.

```sql
+-----------------------------------------------+
| Tables                                        |
+-----------------------------------------------+
| :node_memory_MemAvailable_bytes:sum           |
| ALERTS                                        |
| ALERTS_FOR_STATE                              |
| aggregator_discovery_aggregation_count_total  |
| aggregator_unavailable_apiservice             |
| alertmanager_alerts                           |
| alertmanager_alerts_invalid_total             |
| alertmanager_alerts_received_total            |
| alertmanager_build_info                       |
| ......                                        |
+-----------------------------------------------+
1553 rows in set (0.18 sec)
```
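
As a further spot check, you can query one of these tables directly. The example below uses a node-exporter metric table; substitute any table name from the `SHOW TABLES` output. The `greptime_timestamp` and `greptime_value` columns are the defaults GreptimeDB uses for Prometheus samples:

```sql
-- Show the five most recent samples of a node-exporter metric
SELECT *
FROM node_load1
ORDER BY greptime_timestamp DESC
LIMIT 5;
```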

## Use Grafana for Visualization

Grafana is included in the kube-prometheus-stack and comes pre-configured with dashboards for comprehensive Kubernetes monitoring.

### Access Grafana

Port-forward the Grafana service to access the web interface:

```bash
kubectl port-forward -n monitoring svc/kube-prometheus-grafana 3000:80
```

### Get Admin Credentials

Retrieve the admin password using kubectl:

```bash
kubectl get secret --namespace monitoring kube-prometheus-grafana -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
```

### Log in to Grafana

1. Open your browser and navigate to [http://localhost:3000](http://localhost:3000)
2. Log in with:
   - **Username**: `admin`
   - **Password**: The password retrieved from the previous step

### Explore Pre-configured Dashboards

After logging in, navigate to **Dashboards** to explore the pre-configured Kubernetes monitoring dashboards:

- **Kubernetes / Compute Resources / Cluster**: Overview of cluster-wide resource utilization
- **Kubernetes / Compute Resources / Namespace (Pods)**: Resource usage breakdown by namespace
- **Kubernetes / Compute Resources / Node (Pods)**: Node-level resource monitoring
- **Node Exporter / Nodes**: Detailed node hardware and OS metrics

![Grafana Dashboard](/k8s-prom-monitor-grafana.jpg)

## Conclusion

You now have a complete Kubernetes monitoring solution with Prometheus collecting metrics and GreptimeDB providing efficient long-term storage. This setup enables you to:

- Monitor cluster and application health in real-time
- Store metrics for historical analysis and capacity planning
- Create rich visualizations and dashboards with Grafana
- Query metrics using both PromQL and SQL

For more information about GreptimeDB and Prometheus integration, see:

- [Prometheus Integration](/user-guide/ingest-data/for-observability/prometheus.md)
- [Query Data in GreptimeDB](/user-guide/query-data/overview.md)