Commit 0c137e4

Add VictoriaMetrics switch guide for TiUP cluster (#20957) (#21142)
1 parent 76daf3f commit 0c137e4

1 file changed

+152
-8
lines changed

maintain-tidb-using-tiup.md

Lines changed: 152 additions & 8 deletions
@@ -5,14 +5,7 @@ summary: Learn the common operations to operate and maintain a TiDB cluster usin
55

66
# TiUP Common Operations
77

8-
This document describes the following common operations when you operate and maintain a TiDB cluster using TiUP.
9-
10-
- View the cluster list
11-
- Start the cluster
12-
- View the cluster status
13-
- Modify the configuration
14-
- Stop the cluster
15-
- Destroy the cluster
8+
This document describes the common operations when you operate and maintain a TiDB cluster using TiUP.
169

1710
## View the cluster list
1811

@@ -289,3 +282,154 @@ The destroy operation stops the services and clears the data directory and deplo
289282
```bash
tiup cluster destroy ${cluster-name}
```
285+
286+
## Switch from Prometheus to VictoriaMetrics
287+
288+
In large-scale clusters, Prometheus might encounter performance bottlenecks when handling a large number of instances. Starting from TiUP v1.16.3, TiUP supports switching the monitoring component from Prometheus to VictoriaMetrics (VM) to provide better scalability, higher performance, and lower resource consumption.
289+
290+
### Set up VictoriaMetrics for a new deployment
291+
292+
By default, TiUP uses Prometheus as the metrics monitoring component. To use VictoriaMetrics instead of Prometheus in a new deployment, configure the topology file as follows:
293+
294+
```yaml
# Monitoring server configuration
monitoring_servers:
  # IP address of the monitoring server
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true

# Grafana server configuration
grafana_servers:
  # IP address of the Grafana server
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
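Before deploying, it can help to confirm that the topology file actually contains the three settings shown above. The following is a minimal sketch, not a TiUP feature: `check_vm_topology` is a hypothetical helper, and it uses a plain-text `grep` match instead of a YAML parser. It therefore only confirms that each key appears uncommented with the value `true`, not that the key sits under the correct section.

```bash
# Hypothetical helper (not part of TiUP): checks that a topology file sets
# the three VictoriaMetrics-related keys to true. This is a text match,
# not a YAML parse, so it does not verify which section a key belongs to.
check_vm_topology() {
  local file="$1" key missing=0
  for key in prom_remote_write_to_vm enable_prom_agent_mode use_vm_as_datasource; do
    # Match "key: true" at the start of a line, ignoring indentation and
    # allowing a trailing comment. Commented-out keys do not match.
    if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*true[[:space:]]*(#.*)?\$" "$file"; then
      echo "missing or not true: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, running `check_vm_topology topology.yaml` before `tiup cluster deploy` reports any setting that is absent, where `topology.yaml` is a placeholder file name.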
310+
311+
### Migrate an existing deployment to VictoriaMetrics
312+
313+
You can perform the migration without affecting running instances. Existing metrics remain in Prometheus, while new metrics are written to VictoriaMetrics through Prometheus remote write.
314+
315+
#### Enable VictoriaMetrics remote write
316+
317+
1. Edit the cluster configuration:
318+
319+
```bash
tiup cluster edit-config ${cluster-name}
```
322+
323+
2. Under `monitoring_servers`, set `prom_remote_write_to_vm` to `true`:
324+
325+
```yaml
monitoring_servers:
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
```
331+
332+
3. Reload the configuration to apply the changes:
333+
334+
```bash
tiup cluster reload ${cluster-name} -R prometheus
```
337+
338+
#### Switch the default data source to VictoriaMetrics
339+
340+
1. Edit the cluster configuration:
341+
342+
```bash
tiup cluster edit-config ${cluster-name}
```
345+
346+
2. Under `grafana_servers`, set `use_vm_as_datasource` to `true`:
347+
348+
```yaml
grafana_servers:
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
354+
355+
3. Reload the configuration to apply the changes:
356+
357+
```bash
tiup cluster reload ${cluster-name} -R grafana
```
360+
361+
#### View historical metrics generated before the switch (optional)
362+
363+
If you need to view historical metrics generated before the switch, switch the Grafana data source back to Prometheus as follows:
364+
365+
1. Edit the cluster configuration:
366+
367+
```bash
tiup cluster edit-config ${cluster-name}
```
370+
371+
2. Under `grafana_servers`, comment out `use_vm_as_datasource`:
372+
373+
```yaml
grafana_servers:
  - host: ip_address
    ...
    # use_vm_as_datasource: true
```
379+
380+
3. Reload the configuration to apply the changes:
381+
382+
```bash
tiup cluster reload ${cluster-name} -R grafana
```
385+
386+
4. To switch back to VictoriaMetrics, repeat the steps in [Switch the default data source to VictoriaMetrics](#switch-the-default-data-source-to-victoriametrics).
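Because the data source is toggled by commenting `use_vm_as_datasource` in or out, it is easy to lose track of which backend Grafana is configured to use. The sketch below reads a cluster configuration on standard input and prints the data source implied by that setting. `grafana_datasource` is a hypothetical helper, and it assumes that an absent or commented-out `use_vm_as_datasource` means Prometheus, which is the default described earlier.

```bash
# Hypothetical helper (not part of TiUP): reads a cluster configuration on
# stdin and prints which data source Grafana is configured to use.
# Assumption: a missing or commented-out use_vm_as_datasource means Prometheus.
grafana_datasource() {
  if grep -Eq '^[[:space:]]*use_vm_as_datasource:[[:space:]]*true[[:space:]]*(#.*)?$'; then
    echo "victoriametrics"
  else
    echo "prometheus"
  fi
}
```

If your TiUP version provides `tiup cluster show-config`, one possible use is `tiup cluster show-config ${cluster-name} | grafana_datasource`. The result reflects the saved configuration, which takes effect in Grafana only after the reload step.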
387+
388+
### Clean up old metrics and services
389+
390+
After confirming that the old metrics have expired, you can perform the following steps to remove redundant services and files. This does not affect the running cluster.
391+
392+
#### Set Prometheus to agent mode
393+
394+
1. Edit the cluster configuration:
395+
396+
```bash
tiup cluster edit-config ${cluster-name}
```
399+
400+
2. Under `monitoring_servers`, set `enable_prom_agent_mode` to `true`. Also ensure that `prom_remote_write_to_vm` (under `monitoring_servers`) and `use_vm_as_datasource` (under `grafana_servers`) are both set to `true`:
401+
402+
```yaml
monitoring_servers:
  - host: ip_address
    ...
    prom_remote_write_to_vm: true
    enable_prom_agent_mode: true
grafana_servers:
  - host: ip_address
    ...
    use_vm_as_datasource: true
```
413+
414+
3. Reload the configuration to apply the changes:
415+
416+
```bash
tiup cluster reload ${cluster-name} -R prometheus
```
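Prometheus agent mode disables local querying, so it is worth confirming that the earlier migration steps are in place before enabling it. The following sketch reads a cluster configuration on standard input and succeeds only if both prerequisite settings are already `true`. `ready_for_agent_mode` is a hypothetical helper that uses a plain-text match, not a YAML parser.

```bash
# Hypothetical helper (not part of TiUP): succeeds only if remote write to
# VictoriaMetrics and the VictoriaMetrics data source are both enabled in
# the configuration read from stdin.
ready_for_agent_mode() {
  local config key
  # Buffer stdin so that it can be searched once per key.
  config="$(cat)"
  for key in prom_remote_write_to_vm use_vm_as_datasource; do
    if ! printf '%s\n' "$config" |
      grep -Eq "^[[:space:]]*${key}:[[:space:]]*true[[:space:]]*(#.*)?\$"; then
      echo "not ready: ${key} is not set to true" >&2
      return 1
    fi
  done
  echo "ready"
}
```

If your TiUP version provides `tiup cluster show-config`, one possible use is `tiup cluster show-config ${cluster-name} | ready_for_agent_mode`, run before you edit the configuration.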
419+
420+
#### Remove expired data directories
421+
422+
1. In the configuration file, locate the `data_dir` path of the monitoring server:
423+
424+
```yaml
monitoring_servers:
  - host: ip_address
    ...
    data_dir: "/tidb-data/prometheus-8249"
```
430+
431+
2. Remove the data directory:
432+
433+
```bash
rm -rf /tidb-data/prometheus-8249
```
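Because `rm -rf` is irreversible, a guarded variant can reduce the risk of deleting the wrong path. The sketch below is one possible approach, to be run on the monitoring host. `remove_prom_data_dir` is a hypothetical helper, and the checks it performs are illustrative assumptions, not TiUP requirements.

```bash
# Hypothetical helper (not part of TiUP): removes an expired Prometheus data
# directory only after a few sanity checks.
remove_prom_data_dir() {
  local dir="$1"
  # Refuse empty, root, or relative paths so that a bad variable cannot
  # expand into a destructive rm -rf.
  case "$dir" in
    ""|"/") echo "refusing to remove '${dir}'" >&2; return 1 ;;
    /*) ;;
    *) echo "expected an absolute path: ${dir}" >&2; return 1 ;;
  esac
  if [ ! -d "$dir" ]; then
    echo "not a directory: ${dir}" >&2
    return 1
  fi
  # Show how much space will be reclaimed before deleting.
  du -sh -- "$dir"
  rm -rf -- "$dir"
}
```

With the `data_dir` value from the example above, the call would be `remove_prom_data_dir /tidb-data/prometheus-8249`.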
