Commit 9da6fb9

Merge remote-tracking branch 'upstream/master'
2 parents 69ee308 + fe33d98 commit 9da6fb9

File tree

12 files changed

+212
-85
lines changed

.gemini/config.yaml

Lines changed: 1 addition & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -6,7 +6,4 @@ have_fun: false # Disables fun features like poems in PR summaries
66

77
code_review:
88
disable: false # Enables code reviews
9-
comment_severity_threshold: LOW ## Allows low severity comments such as grammar and spelling mistakes to be added to the PR
10-
help: false ## Disables the help message on pull request open
11-
summary: true ## Posts a pull request summary on the pull request open.
12-
code_review: true ## Posts a code review on pull request open.
9+
comment_severity_threshold: LOW ## Allows low severity comments such as grammar and spelling mistakes to be added to the PR

.gemini/styleguide.md

Lines changed: 12 additions & 3 deletions
@@ -2,7 +2,12 @@
22

33
## Behavior instruction
44

5-
You are acting as a **senior technical writer** reviewing TiDB documentation pull requests. You always provide ready-to-commit doc suggestions so the PR author can commit them directly.
5+
You are acting as a **senior technical writer** who is reviewing TiDB documentation pull requests and you always provide ready-to-commit doc suggestions so the PR author can commit them directly.
6+
7+
## Note
8+
9+
- When you finish the review, you directly add comments to the PR instead of requesting changes to avoid blocking the pull request from being merged.
10+
- If the PR author is ti-chi-bot, you only correct English grammar, spelling, and punctuation mistakes, if any.
611

712
## Review aspects
813

@@ -12,7 +17,7 @@ You are acting as a **senior technical writer** reviewing TiDB documentation pul
1217

1318
## General writing principles
1419

15-
- Correct English grammar, spelling, and punctuation mistakes if any.
20+
- Correct English grammar, spelling, and punctuation mistakes, if any.
1621
- Make sure the documentation is easy to understand for TiDB users.
1722
- Write in **second person** ("you") when addressing users.
1823
- Prefer **present tense** unless describing historical behavior.
@@ -41,7 +46,7 @@ You are acting as a **senior technical writer** reviewing TiDB documentation pul
4146

4247
- Inconsistent use of technical terms
4348

44-
_"cloud cluster" vs. "serverless cluster"_ – pick one.
49+
_"TiDB Cloud Serverless clusters" vs. "TiDB Serverless clusters"_ – pick one.
4550

4651
- Unclear step instructions
4752

@@ -56,3 +61,7 @@ You are acting as a **senior technical writer** reviewing TiDB documentation pul
5661
- Follow any existing terminology in our glossary (`/glossary.md` if available).
5762
- When in doubt, favor clarity over cleverness.
5863
- If something might confuse a new user, suggest a reword.
64+
65+
## Purposes of this style guide
66+
67+
This guide helps Gemini Code Assist provide actionable, high-quality suggestions for improving technical documentation, especially for PRs related to user guides, how-to articles, and product reference material.

TOC-tidb-cloud.md

Lines changed: 0 additions & 1 deletion
@@ -7,7 +7,6 @@
77
- [Architecture](/tidb-cloud/tidb-cloud-intro.md#architecture)
88
- [High Availability](/tidb-cloud/high-availability-with-multi-az.md)
99
- [MySQL Compatibility](/mysql-compatibility.md)
10-
- [Roadmap](/tidb-cloud/tidb-cloud-roadmap.md)
1110
- Get Started
1211
- [Try Out TiDB Cloud](/tidb-cloud/tidb-cloud-quickstart.md)
1312
- [Try Out TiDB + AI](/vector-search/vector-search-get-started-using-python.md)

TOC.md

Lines changed: 1 addition & 0 deletions
@@ -432,6 +432,7 @@
432432
- [Local Read Under Three Data Centers Deployment](/best-practices/three-dc-local-read.md)
433433
- [Use UUIDs](/best-practices/uuid.md)
434434
- [Read-Only Storage Nodes](/best-practices/readonly-nodes.md)
435+
- [SaaS Multi-Tenant Scenarios](/best-practices/saas-best-practices.md)
435436
- [Use Placement Rules](/configure-placement-rules.md)
436437
- [Use Load Base Split](/configure-load-base-split.md)
437438
- [Use Store Limit](/configure-store-limit.md)

_docHome.md

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -38,12 +38,6 @@ Explore native support of Vector Search in TiDB Cloud Serverless to build your A
3838

3939
</DocHomeCard>
4040

41-
<DocHomeCard href="/tidbcloud/tidb-cloud-roadmap" label="TiDB Cloud Roadmap" icon="cloud-roadmap-mauve">
42-
43-
Planned features and releases for TiDB Cloud.
44-
45-
</DocHomeCard>
46-
4741
</DocHomeCardContainer>
4842

4943
</DocHomeSection>
Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
1+
---
2+
title: Best Practices for SaaS Multi-Tenant Scenarios
3+
summary: Learn best practices for TiDB in SaaS (Software as a Service) multi-tenant scenarios, especially for environments where the number of tables in a single cluster exceeds one million.
4+
---
5+
6+
# Best Practices for SaaS Multi-Tenant Scenarios
7+
8+
This document introduces best practices for TiDB in SaaS (Software as a Service) multi-tenant environments, especially in scenarios where the **number of tables in a single cluster exceeds one million**. By making reasonable configurations and choices, you can enable TiDB to run efficiently and stably in SaaS scenarios while reducing resource consumption and costs.
9+
10+
> **Note:**
11+
>
12+
> It is recommended to use TiDB v8.5.0 or later versions.
13+
14+
## TiDB hardware recommendations
15+
16+
It is recommended to use high-memory TiDB instances. For example:
17+
18+
- For one million tables, use 32 GiB or more memory.
19+
- For three million tables, use 64 GiB or more memory.
20+
21+
High-memory TiDB instances allocate more cache space for Infoschema, Statistics, and execution plan caches, thereby improving cache hit rates and consequently enhancing business performance. Larger memory also mitigates performance fluctuations and stability issues caused by TiDB GC.
22+
23+
Recommended hardware configurations for TiKV and PD are as follows:
24+
25+
* TiKV: 8 vCPUs and 32 GiB or more memory.
26+
* PD: 8 CPUs and 16 GiB or more memory.
27+
28+
## Control the number of Regions
29+
30+
If you need to create a large number of tables (for example, more than 100,000), it is recommended to set the TiDB configuration item [`split-table`](/tidb-configuration-file.md#split-table) to `false` to reduce the number of Regions, thus alleviating memory pressure on TiKV.
31+
32+
## Configure caches
33+
34+
* Starting from TiDB v8.4.0, TiDB loads table information involved in SQL statements into the Infoschema cache on demand during SQL execution.
35+
36+
- You can monitor the size and hit rate of the Infoschema cache by observing the **Infoschema v2 Cache Size** and **Infoschema v2 Cache Operation** sub-panels under the **Schema Load** panel in TiDB Dashboard.
37+
- You can use the [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800) system variable to adjust the memory limit of the Infoschema cache to meet business needs. The size of the Infoschema cache is linearly related to the number of different tables involved in SQL execution. In actual tests, fully caching metadata for one million tables (each with four columns, one primary key, and one index) requires about 2.4 GiB of memory.
38+
39+
* TiDB loads table statistics involved in SQL statements into the Statistics cache on demand during SQL execution.
40+
41+
- You can monitor the size and hit rate of the Statistics cache by observing the **Stats Cache Cost** and **Stats Cache OPS** sub-panels under the **Statistics & Plan Management** panel in TiDB Dashboard.
42+
- You can use the [`tidb_stats_cache_mem_quota`](/system-variables.md#tidb_stats_cache_mem_quota-new-in-v610) system variable to adjust the memory limit of the Statistics cache to meet business needs. In actual tests, executing simple SQL (using the `IndexRangeScan` operator) on 100,000 tables consumes about 3.96 GiB of memory in the Statistics cache.
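
As a rough sizing aid for the Infoschema cache guidance above, the following sketch extrapolates linearly from the single measured reference point quoted in the text (about 2.4 GiB for one million tables with four columns, one primary key, and one index). The function name and constants are hypothetical and not part of TiDB. Real usage depends on table shape, so treat the result only as a starting point when choosing a value for `tidb_schema_cache_size`.

```python
GIB = 1024 ** 3

# Reference point quoted in the text: ~2.4 GiB for one million tables
# (each with four columns, one primary key, and one index).
BYTES_PER_TABLE = 2.4 * GIB / 1_000_000


def estimate_schema_cache_bytes(distinct_tables: int) -> int:
    """Linearly extrapolate the Infoschema cache size for a given table count.

    Assumption: tables resemble the small test tables described in the text.
    """
    if distinct_tables < 0:
        raise ValueError("distinct_tables must be non-negative")
    return round(distinct_tables * BYTES_PER_TABLE)


if __name__ == "__main__":
    # Example: 250,000 distinct tables touched by the workload.
    print(estimate_schema_cache_bytes(250_000))
```

Wider tables or tables with more indexes will likely need more memory per table than this estimate suggests.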
43+
44+
## Collect statistics
45+
46+
* Starting from TiDB v8.4.0, TiDB introduces the [`tidb_auto_analyze_concurrency`](/system-variables.md#tidb_auto_analyze_concurrency-new-in-v840) system variable to control the number of concurrent auto-analyze operations that can run in a TiDB cluster. In multi-table scenarios, you can increase this concurrency as needed to improve the throughput of automatic analysis. As the concurrency value increases, the throughput and the CPU usage of the TiDB Owner node increase linearly. In actual tests, using a concurrency value of 16 allows automatic analysis of 320 tables (each with 10,000 rows, 4 columns, and 1 index) within one minute, consuming one CPU core of the TiDB Owner node.
47+
* The [`tidb_auto_build_stats_concurrency`](/system-variables.md#tidb_auto_build_stats_concurrency-new-in-v650) and [`tidb_build_sampling_stats_concurrency`](/system-variables.md#tidb_build_sampling_stats_concurrency-new-in-v750) system variables control the concurrency of TiDB statistics construction. You can adjust them based on your scenario:
48+
- For scenarios with many partitioned tables, prioritize increasing the value of `tidb_auto_build_stats_concurrency`.
49+
- For scenarios with many columns, prioritize increasing the value of `tidb_build_sampling_stats_concurrency`.
50+
* To avoid excessive resource usage, ensure that the product of `tidb_auto_analyze_concurrency`, `tidb_auto_build_stats_concurrency`, and `tidb_build_sampling_stats_concurrency` does not exceed the number of TiDB CPU cores.
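
The product rule in the last bullet can be checked before changing any of the three variables. The sketch below is illustrative only: the function name and its parameters are hypothetical, and it simply multiplies the three concurrency values and compares the result with the number of TiDB CPU cores.

```python
def analyze_concurrency_within_budget(
    auto_analyze: int,
    auto_build_stats: int,
    build_sampling_stats: int,
    tidb_cpu_cores: int,
) -> bool:
    """Return True if the product of the three concurrency settings
    does not exceed the number of TiDB CPU cores.

    The arguments stand for the planned values of
    tidb_auto_analyze_concurrency, tidb_auto_build_stats_concurrency,
    and tidb_build_sampling_stats_concurrency.
    """
    values = (auto_analyze, auto_build_stats, build_sampling_stats, tidb_cpu_cores)
    if any(v < 1 for v in values):
        raise ValueError("all arguments must be positive integers")
    return auto_analyze * auto_build_stats * build_sampling_stats <= tidb_cpu_cores


if __name__ == "__main__":
    # 2 * 2 * 4 = 16 fits a 16-core instance; 4 * 2 * 4 = 32 does not.
    print(analyze_concurrency_within_budget(2, 2, 4, 16))
    print(analyze_concurrency_within_budget(4, 2, 4, 16))
```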
51+
52+
## Query system tables efficiently
53+
54+
When querying system tables, it is recommended to add filters such as `TABLE_SCHEMA`, `TABLE_NAME`, or `TIDB_TABLE_ID` to avoid scanning a large amount of irrelevant data. This improves query speed and reduces resource consumption.
55+
56+
For example, in a scenario with three million tables:
57+
58+
- Executing the following SQL statement consumes about 8 GiB of memory.
59+
60+
```sql
61+
SELECT COUNT(*) FROM information_schema.tables;
62+
```
63+
64+
- Executing the following SQL statement takes about 20 minutes.
65+
66+
```sql
67+
SELECT COUNT(*) FROM information_schema.views;
68+
```
69+
70+
By adding appropriate filter conditions to the preceding SQL statements, memory consumption becomes negligible, and query time is reduced to milliseconds.
71+
72+
## Handle connection-intensive scenarios
73+
74+
In SaaS multi-tenant scenarios, each user usually connects to TiDB to operate data in their own tenant (database). To support a high number of connections:
75+
76+
* Increase the TiDB configuration item [`token-limit`](/tidb-configuration-file.md#token-limit) (`1000` by default) to support more concurrent requests.
77+
* The memory usage of TiDB is roughly linear with the number of connections. In actual tests, 200,000 idle connections increase TiDB memory usage by about 30 GiB. It is recommended to increase TiDB memory specifications based on actual connection numbers.
78+
* If you use `PREPARED` statements, each connection maintains a session-level Prepared Plan Cache. If the `DEALLOCATE` statement is not executed for a long time, the cache might accumulate too many plans, increasing memory usage. In actual tests, 400,000 execution plans involving `IndexRangeScan` consume approximately 5 GiB of memory. It is recommended to increase memory specifications accordingly.
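
To turn the two measurements above into a memory budget, a back-of-the-envelope estimate might look like the following. It assumes linear scaling from the quoted reference points (about 30 GiB per 200,000 idle connections and about 5 GiB per 400,000 cached `IndexRangeScan` plans). The text states rough linearity only for connections, so the linear plan-cache term is an assumption, and all names here are hypothetical.

```python
GIB = 1024 ** 3

# Reference points quoted in the text (measured in the authors' tests).
BYTES_PER_IDLE_CONNECTION = 30 * GIB / 200_000  # ~30 GiB per 200,000 idle connections
BYTES_PER_CACHED_PLAN = 5 * GIB / 400_000       # ~5 GiB per 400,000 cached plans


def estimate_extra_memory_gib(idle_connections: int, cached_plans: int) -> float:
    """Estimate additional TiDB memory (in GiB) for connections and cached plans.

    Assumption: both costs scale linearly from the reference points above.
    """
    if idle_connections < 0 or cached_plans < 0:
        raise ValueError("counts must be non-negative")
    total_bytes = (
        idle_connections * BYTES_PER_IDLE_CONNECTION
        + cached_plans * BYTES_PER_CACHED_PLAN
    )
    return total_bytes / GIB


if __name__ == "__main__":
    # Example: 50,000 idle connections holding 100,000 cached plans in total.
    print(estimate_extra_memory_gib(50_000, 100_000))
```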
79+
80+
## Use stale read carefully
81+
82+
When you use [Stale Read](/stale-read.md), an outdated schema version might trigger a full load of historical schemas, which can significantly impact performance. To mitigate this issue, increase the value of [`tidb_schema_version_cache_limit`](/system-variables.md#tidb_schema_version_cache_limit-new-in-v740) (for example, to `255`).
83+
84+
## Optimize BR backup and restore
85+
86+
* When restoring a full backup with millions of tables, it is recommended to use high-memory BR instances. For example:
87+
- For one million tables, use BR instances with 32 GiB or more memory.
88+
- For three million tables, use BR instances with 64 GiB or more memory.
89+
* BR log backup and snapshot restore consume additional TiKV memory. It is recommended to use TiKV instances with 32 GiB or more memory.
90+
* Adjust BR configurations [`pitr-batch-count` and `pitr-concurrency`](/br/use-br-command-line-tool.md#common-options) as needed to improve log restore speed.
91+
92+
## Import data with TiDB Lightning
93+
94+
When importing millions of tables using [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md), follow these recommendations:
95+
96+
- For large tables (over 100 GiB), use TiDB Lightning [physical import mode](/tidb-lightning/tidb-lightning-physical-import-mode.md).
97+
- For small tables (typically numerous in quantity), use TiDB Lightning [logical import mode](/tidb-lightning/tidb-lightning-logical-import-mode.md).

character-set-and-collation.md

Lines changed: 1 addition & 1 deletion
@@ -104,7 +104,7 @@ SHOW CHARACTER SET;
104104
+---------+-------------------------------------+-------------------+--------+
105105
| ascii | US ASCII | ascii_bin | 1 |
106106
| binary | binary | binary | 1 |
107-
| gbk | Chinese Internal Code Specification | gbk_bin | 2 |
107+
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
108108
| latin1 | Latin1 | latin1_bin | 1 |
109109
| utf8 | UTF-8 Unicode | utf8_bin | 3 |
110110
| utf8mb4 | UTF-8 Unicode | utf8mb4_bin | 4 |

character-set-gbk.md

Lines changed: 10 additions & 34 deletions
@@ -5,7 +5,9 @@ summary: This document provides details about the TiDB support of the GBK charac
55

66
# GBK
77

8-
Since v5.4.0, TiDB supports the GBK character set. This document provides the TiDB support and compatibility information of the GBK character set.
8+
Starting from v5.4.0, TiDB supports the GBK character set. This document provides the TiDB support and compatibility information of the GBK character set.
9+
10+
Starting from v6.0.0, TiDB enables the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) by default. The default collation for TiDB GBK character set is `gbk_chinese_ci`, which is consistent with MySQL.
911

1012
```sql
1113
SHOW CHARACTER SET WHERE CHARSET = 'gbk';
@@ -15,7 +17,7 @@ SHOW CHARACTER SET WHERE CHARSET = 'gbk';
1517
+---------+-------------------------------------+-------------------+--------+
1618
| Charset | Description | Default collation | Maxlen |
1719
+---------+-------------------------------------+-------------------+--------+
18-
| gbk | Chinese Internal Code Specification | gbk_bin | 2 |
20+
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
1921
+---------+-------------------------------------+-------------------+--------+
2022
1 row in set (0.00 sec)
2123
```
@@ -40,48 +42,22 @@ This section provides the compatibility information between MySQL and TiDB.
4042

4143
### Collations
4244

43-
The default collation of the GBK character set in MySQL is `gbk_chinese_ci`. Unlike MySQL, the default collation of the GBK character set in TiDB is `gbk_bin`. Additionally, because TiDB converts GBK to `utf8mb4` and then uses a binary collation, the `gbk_bin` collation in TiDB is not the same as the `gbk_bin` collation in MySQL.
44-
4545
<CustomContent platform="tidb">
4646

47-
To make TiDB compatible with the collations of MySQL GBK character set, when you first initialize the TiDB cluster, you need to set the TiDB option [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) to `true` to enable the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations). This is the default setting for new deployments.
47+
The default collation of the GBK character set in MySQL is `gbk_chinese_ci`. The default collation for the GBK character set in TiDB depends on the value of the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap):
48+
49+
- By default, the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) is set to `true`, which means that the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) is enabled and the default collation for the GBK character set is `gbk_chinese_ci`.
50+
- When the TiDB configuration item [`new_collations_enabled_on_first_bootstrap`](/tidb-configuration-file.md#new_collations_enabled_on_first_bootstrap) is set to `false`, the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) is disabled, and the default collation for the GBK character set is `gbk_bin`.
4851

4952
</CustomContent>
5053

5154
<CustomContent platform="tidb-cloud">
5255

53-
To make TiDB compatible with the collations of MySQL GBK character set, when you first initialize the TiDB cluster, TiDB Cloud enables the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) by default.
56+
By default, TiDB Cloud enables the [new framework for collations](/character-set-and-collation.md#new-framework-for-collations) and the default collation for the GBK character set is `gbk_chinese_ci`.
5457

5558
</CustomContent>
5659

57-
After enabling the new framework for collations, if you check the collations corresponding to the GBK character set, you can see that the TiDB GBK default collation is changed to `gbk_chinese_ci`.
58-
59-
```sql
60-
SHOW CHARACTER SET WHERE CHARSET = 'gbk';
61-
```
62-
63-
```
64-
+---------+-------------------------------------+-------------------+--------+
65-
| Charset | Description | Default collation | Maxlen |
66-
+---------+-------------------------------------+-------------------+--------+
67-
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
68-
+---------+-------------------------------------+-------------------+--------+
69-
1 row in set (0.00 sec)
70-
```
71-
72-
```sql
73-
SHOW COLLATION WHERE CHARSET = 'gbk';
74-
```
75-
76-
```
77-
+----------------+---------+----+---------+----------+---------+---------------+
78-
| Collation | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
79-
+----------------+---------+----+---------+----------+---------+---------------+
80-
| gbk_bin | gbk | 87 | | Yes | 1 | PAD SPACE |
81-
| gbk_chinese_ci | gbk | 28 | Yes | Yes | 1 | PAD SPACE |
82-
+----------------+---------+----+---------+----------+---------+---------------+
83-
2 rows in set (0.00 sec)
84-
```
60+
Additionally, because TiDB converts GBK to `utf8mb4` and then uses a binary collation, the `gbk_bin` collation in TiDB is not the same as the `gbk_bin` collation in MySQL.
8561

8662
### Illegal character compatibility
8763

sql-statements/sql-statement-show-character-set.md

Lines changed: 11 additions & 10 deletions
@@ -26,16 +26,17 @@ SHOW CHARACTER SET;
2626
```
2727

2828
```
29-
+---------+---------------+-------------------+--------+
30-
| Charset | Description | Default collation | Maxlen |
31-
+---------+---------------+-------------------+--------+
32-
| utf8 | UTF-8 Unicode | utf8_bin | 3 |
33-
| utf8mb4 | UTF-8 Unicode | utf8mb4_bin | 4 |
34-
| ascii | US ASCII | ascii_bin | 1 |
35-
| latin1 | Latin1 | latin1_bin | 1 |
36-
| binary | binary | binary | 1 |
37-
+---------+---------------+-------------------+--------+
38-
5 rows in set (0.00 sec)
29+
+---------+-------------------------------------+-------------------+--------+
30+
| Charset | Description | Default collation | Maxlen |
31+
+---------+-------------------------------------+-------------------+--------+
32+
| ascii | US ASCII | ascii_bin | 1 |
33+
| binary | binary | binary | 1 |
34+
| gbk | Chinese Internal Code Specification | gbk_chinese_ci | 2 |
35+
| latin1 | Latin1 | latin1_bin | 1 |
36+
| utf8 | UTF-8 Unicode | utf8_bin | 3 |
37+
| utf8mb4 | UTF-8 Unicode | utf8mb4_bin | 4 |
38+
+---------+-------------------------------------+-------------------+--------+
39+
6 rows in set (0.00 sec)
3940
```
4041

4142
```sql

tidb-cloud/tidb-cloud-roadmap.md

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,10 @@ summary: Learn about TiDB Cloud's roadmap for the next few months. See the new f
55

66
# TiDB Cloud Roadmap
77

8+
> **Warning:**
9+
>
10+
> This roadmap might contain outdated information. We are working on updating it to reflect the latest product plans and development priorities.
11+
812
The TiDB Cloud roadmap brings you what's coming in the near future, so you can see the new features or improvements in advance, follow the progress, and learn about the key milestones on the way. In the course of development, this roadmap is subject to change based on user needs, feedback, and our assessment.
913

1014
✅: The feature or improvement is already available in TiDB Cloud.
