From 6d8d778b6eb3ac1e537a7a4c95cbc16fea9835e5 Mon Sep 17 00:00:00 2001 From: xixirangrang Date: Thu, 23 Oct 2025 11:02:22 +0800 Subject: [PATCH] This is an automated cherry-pick of #21928 Signed-off-by: ti-chi-bot --- releases/release-6.5.0.md | 2 +- releases/release-7.6.0.md | 4 +- releases/release-8.3.0.md | 436 +++++++++++++++++++++++ system-variables.md | 4 +- ticdc/ticdc-bidirectional-replication.md | 34 +- ticdc/ticdc-changefeed-config.md | 436 +++++++++++++++++++++++ tidb-cloud/recovery-group-overview.md | 2 +- 7 files changed, 897 insertions(+), 21 deletions(-) create mode 100644 releases/release-8.3.0.md diff --git a/releases/release-6.5.0.md b/releases/release-6.5.0.md index 02c501eae4c25..8c9b50259a1ac 100644 --- a/releases/release-6.5.0.md +++ b/releases/release-6.5.0.md @@ -326,7 +326,7 @@ Compared with TiDB [6.4.0-DMR](/releases/release-6.4.0.md), TiDB 6.5.0 introduce | [`tidb_cdc_write_source`](/system-variables.md#tidb_cdc_write_source-new-in-v650) | Newly added | When this variable is set to a value other than 0, data written in this session is considered to be written by TiCDC. This variable can only be modified by TiCDC. Do not manually modify this variable in any case. | | [`tidb_enable_plan_replayer_capture`](/system-variables.md#tidb_enable_plan_replayer_capture) | Newly added | The feature controlled by this variable is not fully functional in TiDB v6.5.0. Do not change the default value. | | [`tidb_index_merge_intersection_concurrency`](/system-variables.md#tidb_index_merge_intersection_concurrency-new-in-v650) | Newly added | Sets the maximum concurrency for the intersection operations that index merge performs. It is effective only when TiDB accesses partitioned tables in the dynamic pruning mode. | -| [`tidb_source_id`](/system-variables.md#tidb_source_id-new-in-v650) | Newly added | This variable is used to configure the different cluster IDs in a [bi-directional replication](/ticdc/ticdc-bidirectional-replication.md) cluster.| +| [`tidb_source_id`](/system-variables.md#tidb_source_id-new-in-v650) | Newly added | This variable is used to configure the different cluster IDs in a [bidirectional replication](/ticdc/ticdc-bidirectional-replication.md) cluster.| | [`tidb_sysproc_scan_concurrency`](/system-variables.md#tidb_sysproc_scan_concurrency-new-in-v650) | Newly added | This variable is used to set the concurrency of scan operations performed when TiDB executes internal SQL statements (such as an automatic update of statistics). The default value is `1`. | | [`tidb_ttl_delete_batch_size`](/system-variables.md#tidb_ttl_delete_batch_size-new-in-v650) | Newly added | This variable is used to set the maximum number of rows that can be deleted in a single `DELETE` transaction in a TTL job. | | [`tidb_ttl_delete_rate_limit`](/system-variables.md#tidb_ttl_delete_rate_limit-new-in-v650) | Newly added | This variable is used to limit the maximum number of `DELETE` statements allowed per second in a single node in a TTL job. When this variable is set to `0`, no limit is applied. | diff --git a/releases/release-7.6.0.md b/releases/release-7.6.0.md index a1b14bd97b533..60eae0d3b66e3 100644 --- a/releases/release-7.6.0.md +++ b/releases/release-7.6.0.md @@ -234,9 +234,9 @@ Quick access: [Quick start](https://docs.pingcap.com/tidb/v7.6/quick-start-with- For more information, see [documentation](/dm/dm-compatibility-catalog.md). 
-* TiCDC supports replicating DDL statements in bi-directional replication (BDR) mode (experimental) [#10301](https://github.com/pingcap/tiflow/issues/10301) [#48519](https://github.com/pingcap/tidb/issues/48519) @[okJiang](https://github.com/okJiang) @[asddongmen](https://github.com/asddongmen) +* TiCDC supports replicating DDL statements in bidirectional replication (BDR) mode (experimental) [#10301](https://github.com/pingcap/tiflow/issues/10301) [#48519](https://github.com/pingcap/tidb/issues/48519) @[okJiang](https://github.com/okJiang) @[asddongmen](https://github.com/asddongmen) - Starting from v7.6.0, TiCDC supports replication of DDL statements with bi-directional replication configured. Previously, replicating DDL statements was not supported by TiCDC, so users of TiCDC's bi-directional replication had to apply DDL statements to both TiDB clusters separately. With this feature, TiCDC allows for a cluster to be assigned the `PRIMARY` BDR role, and enables the replication of DDL statements from that cluster to the downstream cluster. + Starting from v7.6.0, TiCDC supports replication of DDL statements with bidirectional replication configured. Previously, replicating DDL statements was not supported by TiCDC, so users of TiCDC's bidirectional replication had to apply DDL statements to both TiDB clusters separately. With this feature, TiCDC allows for a cluster to be assigned the `PRIMARY` BDR role, and enables the replication of DDL statements from that cluster to the downstream cluster. For more information, see [documentation](/ticdc/ticdc-bidirectional-replication.md). diff --git a/releases/release-8.3.0.md b/releases/release-8.3.0.md new file mode 100644 index 0000000000000..fe1154db4a85e --- /dev/null +++ b/releases/release-8.3.0.md @@ -0,0 +1,436 @@ +--- +title: TiDB 8.3.0 Release Notes +summary: Learn about the new features, compatibility changes, improvements, and bug fixes in TiDB 8.3.0. +--- + +# TiDB 8.3.0 Release Notes + +Release date: August 22, 2024 + +TiDB version: 8.3.0 + +Quick access: [Quick start](https://docs.pingcap.com/tidb/v8.3/quick-start-with-tidb) + +8.3.0 introduces the following key features and improvements: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+<table>
+<thead>
+  <tr>
+    <th>Category</th>
+    <th>Feature/Enhancement</th>
+    <th>Description</th>
+  </tr>
+</thead>
+<tbody>
+  <tr>
+    <td rowspan="3">Scalability and Performance</td>
+    <td>Global indexes for partitioned tables (experimental)</td>
+    <td>Global indexes can effectively improve the efficiency of retrieving non-partitioned columns, and remove the restriction that a unique key must contain the partition key. This feature extends the usage scenarios of TiDB partitioned tables and avoids some of the application modification work that might be required for data migration.</td>
+  </tr>
+  <tr>
+    <td>Default pushdown of the Projection operator to the storage engine</td>
+    <td>Pushing the Projection operator down to the storage engine can distribute the load across storage nodes while reducing data transfer between nodes. This optimization helps to reduce the execution time for certain SQL queries and improves the overall database performance.</td>
+  </tr>
+  <tr>
+    <td>Ignoring unnecessary columns when collecting statistics</td>
+    <td>Under the premise of ensuring that the optimizer can obtain the necessary information, TiDB speeds up statistics collection, improves the timeliness of statistics, and thus ensures that the optimal execution plan is selected, improving the performance of the cluster. Meanwhile, TiDB also reduces the system overhead and improves resource utilization.</td>
+  </tr>
+  <tr>
+    <td>Reliability and Availability</td>
+    <td>Built-in virtual IP management in TiProxy</td>
+    <td>TiProxy introduces built-in virtual IP management. When configured, it supports automatic virtual IP switching without relying on external platforms or tools. This feature simplifies TiProxy deployment and reduces the complexity of the database access layer.</td>
+  </tr>
+</tbody>
+</table>
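As a minimal sketch of the global index syntax highlighted in the table above (the table schema and names are hypothetical, and because the feature is experimental in v8.3.0, it must be enabled explicitly first):

```sql
-- Experimental in v8.3.0: the global index feature must be enabled first.
SET tidb_enable_global_index = ON;

-- A hypothetical partitioned table. The unique key on `uid` does not
-- contain the partition key, so it is declared with the GLOBAL keyword.
CREATE TABLE user_events (
    id  BIGINT NOT NULL,
    uid BIGINT NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uk_uid (uid) GLOBAL
) PARTITION BY HASH (id) PARTITIONS 4;
```

A point lookup on `uid` can then use the global index instead of probing every partition.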
+ +## Feature details + +### Performance + +* The optimizer allows pushing the `Projection` operator down to the storage engine by default [#51876](https://github.com/pingcap/tidb/issues/51876) @[yibin87](https://github.com/yibin87) + + Pushing the `Projection` operator down to the storage engine reduces data transfer between the compute engine and the storage engine, thereby improving SQL execution performance. This is particularly effective for queries containing [JSON query functions](/functions-and-operators/json-functions/json-functions-search.md) or [JSON value attribute functions](/functions-and-operators/json-functions/json-functions-return.md). Starting from v8.3.0, TiDB enables the `Projection` operator pushdown feature by default, by changing the default value of the system variable controlling this feature, [`tidb_opt_projection_push_down`](/system-variables.md#tidb_opt_projection_push_down-new-in-v610), from `OFF` to `ON`. When this feature is enabled, the optimizer automatically pushes eligible JSON query functions and JSON value attribute functions down to the storage engine. + + For more information, see [documentation](/system-variables.md#tidb_opt_projection_push_down-new-in-v610). + +* Optimize batch processing strategy for KV (key-value) requests [#55206](https://github.com/pingcap/tidb/issues/55206) @[zyguan](https://github.com/zyguan) + + TiDB fetches data by sending KV requests to TiKV. Batching and processing KV requests in bulk can significantly improve execution performance. Before v8.3.0, the batching strategy in TiDB is less efficient. Starting from v8.3.0, TiDB introduces several more efficient batching strategies in addition to the existing one. You can configure different batching strategies using the [`tikv-client.batch-policy`](/tidb-configuration-file.md#batch-policy-new-in-v830) configuration item to accommodate various workloads. + + For more information, see [documentation](/tidb-configuration-file.md#batch-policy-new-in-v830). + +* TiFlash introduces HashAgg aggregation calculation modes to improve the performance for high NDV data [#9196](https://github.com/pingcap/tiflash/issues/9196) @[guo-shaoge](https://github.com/guo-shaoge) + + Before v8.3.0, TiFlash has low aggregation calculation efficiency during the first stage of HashAgg aggregation when handling data with high NDV (number of distinct values). Starting from v8.3.0, TiFlash introduces multiple HashAgg aggregation calculation modes to improve the aggregation performance for different data characteristics. To choose a desired HashAgg aggregation calculation mode, you can configure the [`tiflash_hashagg_preaggregation_mode`](/system-variables.md#tiflash_hashagg_preaggregation_mode-new-in-v830) system variable. + + For more information, see [documentation](/system-variables.md#tiflash_hashagg_preaggregation_mode-new-in-v830). + +* Ignore unnecessary columns when collecting statistics [#53567](https://github.com/pingcap/tidb/issues/53567) @[hi-rustin](https://github.com/Rustin170506) + + When the optimizer generates an execution plan, it only needs statistics for some columns, such as columns in the filter conditions, columns in the join keys, and columns used for aggregation. Starting from v8.3.0, TiDB continuously observes the historical records of the columns used in SQL statements. By default, TiDB only collects statistics for columns with indexes and columns that are observed to require statistics collection. 
This speeds up the collection of statistics and avoids unnecessary resource consumption. + + When you upgrade your cluster from a version earlier than v8.3.0 to v8.3.0 or later, TiDB retains the original behavior by default, that is, collecting statistics for all columns. To enable this feature, you need to manually set the system variable [`tidb_analyze_column_options`](/system-variables.md#tidb_analyze_column_options-new-in-v830) to `PREDICATE`. For newly deployed clusters, this feature is enabled by default. + + For analytical systems with many random queries, you can set the system variable [`tidb_analyze_column_options`](/system-variables.md#tidb_analyze_column_options-new-in-v830) to `ALL` to collect statistics for all columns, to ensure the performance of random queries. For other types of systems, it is recommended to keep the default setting (`PREDICATE`) of [`tidb_analyze_column_options`](/system-variables.md#tidb_analyze_column_options-new-in-v830) to collect statistics for only necessary columns. + + For more information, see [documentation](/statistics.md#collect-statistics-on-some-columns). + +* Improve the query performance of some system tables [#50305](https://github.com/pingcap/tidb/issues/50305) @[tangenta](https://github.com/tangenta) + + In previous versions, querying system tables has poor performance when the cluster size becomes large and there are a large number of tables. + + In v8.0.0, query performance is optimized for the following four system tables: + + - `INFORMATION_SCHEMA.TABLES` + - `INFORMATION_SCHEMA.STATISTICS` + - `INFORMATION_SCHEMA.KEY_COLUMN_USAGE` + - `INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS` + + In v8.3.0, the query performance is optimized for the following system tables, bringing a multi-fold performance improvement compared to v8.2.0. + + - `INFORMATION_SCHEMA.CHECK_CONSTRAINTS` + - `INFORMATION_SCHEMA.COLUMNS` + - `INFORMATION_SCHEMA.PARTITIONS` + - `INFORMATION_SCHEMA.SCHEMATA` + - `INFORMATION_SCHEMA.SEQUENCES` + - `INFORMATION_SCHEMA.TABLE_CONSTRAINTS` + - `INFORMATION_SCHEMA.TIDB_CHECK_CONSTRAINTS` + - `INFORMATION_SCHEMA.TiDB_INDEXES` + - `INFORMATION_SCHEMA.TIDB_INDEX_USAGE` + - `INFORMATION_SCHEMA.VIEWS` + +* Support partition pruning when partition expressions use the `EXTRACT(YEAR_MONTH...)` function to improve query performance [#54209](https://github.com/pingcap/tidb/pull/54209) @[mjonss](https://github.com/mjonss) + + In previous versions, when partition expressions use the `EXTRACT(YEAR_MONTH...)` function, partition pruning is not supported, resulting in poor query performance. Starting from v8.3.0, partition pruning is supported when partition expressions use the `EXTRACT(YEAR_MONTH...)` function, which improves query performance. + + For more information, see [documentation](/partition-pruning.md#scenario-three). + +* Improve the performance of `CREATE TABLE` by 1.4 times, `CREATE DATABASE` by 2.1 times, and `ADD COLUMN` by 2 times [#54436](https://github.com/pingcap/tidb/issues/54436) @[D3Hunter](https://github.com/D3Hunter) + + TiDB v8.0.0 introduces the system variable [`tidb_enable_fast_create_table`](/system-variables.md#tidb_enable_fast_create_table-new-in-v800) to improve table creation performance in batch table creation scenarios. In v8.3.0, when submitting the DDL statements for table creation concurrently through 10 sessions in a single database, the performance is improved by 1.4 times compared with v8.2.0. + + In v8.3.0, the performance of general DDLs in batch execution has improved compared to v8.2.0. 
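As a quick sketch, turning on the accelerated table creation path looks like the following (`tidb_enable_fast_create_table` is the variable named above; the database and table names are placeholders):

```sql
-- Enable accelerated table creation (introduced in v8.0.0); check the current
-- value first with SHOW VARIABLES LIKE 'tidb_enable_fast_create_table'.
SET GLOBAL tidb_enable_fast_create_table = ON;

-- Concurrent sessions creating tables in the same database then benefit:
CREATE TABLE batch_db.t_0001 (id BIGINT PRIMARY KEY, payload VARCHAR(255));
```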
The performance of `CREATE DATABASE` with 10 concurrent sessions improves by 19 times compared with v8.1.0 and 2.1 times compared with v8.2.0. The performance of using 10 sessions to add columns (`ADD COLUMN`) to multiple tables in the same database in batch improves by 10 times compared with v8.1.0 and 2 times compared with v8.2.0.
+
+    For more information, see [documentation](/system-variables.md#tidb_enable_fast_create_table-new-in-v800).
+
+* Partitioned tables support global indexes (experimental) [#45133](https://github.com/pingcap/tidb/issues/45133) @[mjonss](https://github.com/mjonss) @[Defined2014](https://github.com/Defined2014) @[jiyfhust](https://github.com/jiyfhust) @[L-maple](https://github.com/L-maple)
+
+    In previous versions, partitioned tables have some limitations because global indexes are not supported. For example, the unique key must use every column in the table's partitioning expression, and if the query condition does not use the partitioning key, the query scans all partitions, resulting in poor performance. Starting from v7.6.0, the system variable [`tidb_enable_global_index`](/system-variables.md#tidb_enable_global_index-new-in-v760) is introduced to enable the global index feature. However, this feature was still under development at that time, so enabling it was not recommended.
+
+    Starting from v8.3.0, the global index feature is released as an experimental feature. You can explicitly create a global index for a partitioned table with the keyword `Global` to remove the restriction that the unique key must use every column in the table's partitioning expression, meeting flexible business needs. Global indexes also enhance the performance of queries that do not include partition keys.
+
+    For more information, see [documentation](/partitioned-table.md#global-indexes).
+
+### Reliability
+
+* Support streaming cursor result sets (experimental) [#54526](https://github.com/pingcap/tidb/issues/54526) @[YangKeao](https://github.com/YangKeao)
+
+    When the application code retrieves the result set using [Cursor Fetch](/develop/dev-guide-connection-parameters.md#use-streamingresult-to-get-the-execution-result), TiDB usually first stores the complete result set in memory, and then returns the data to the client in batches. If the result set is too large, TiDB might temporarily write the result to the hard disk.
+
+    Starting from v8.3.0, if you set the system variable [`tidb_enable_lazy_cursor_fetch`](/system-variables.md#tidb_enable_lazy_cursor_fetch-new-in-v830) to `ON`, TiDB no longer reads all data to the TiDB node at once, but reads data gradually as the client fetches it. When TiDB processes large result sets, this feature reduces the memory usage of the TiDB node and improves the stability of the cluster.
+
+    For more information, see [documentation](/system-variables.md#tidb_enable_lazy_cursor_fetch-new-in-v830).
+
+* Enhance SQL execution plan binding [#55280](https://github.com/pingcap/tidb/issues/55280) [#55343](https://github.com/pingcap/tidb/issues/55343) @[time-and-fate](https://github.com/time-and-fate)
+
+    In OLTP scenarios, the optimal execution plan for most SQL statements is fixed. Implementing SQL execution plan binding for important SQL statements in the application can reduce the probability of the execution plan becoming worse and improve system stability, as the following sketch illustrates.
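This is an illustration only; the plan digest values are placeholders that you would first look up in the statement summary tables:

```sql
-- Look up the historical plans to bind (the digest values below are hypothetical).
SELECT plan_digest, query_sample_text
FROM information_schema.statements_summary_history
WHERE query_sample_text LIKE '%orders%';

-- Create bindings from multiple historical execution plans in one statement.
CREATE GLOBAL BINDING FROM HISTORY USING PLAN DIGEST 'digest_1', 'digest_2';
```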
To meet the requirements of creating a large number of SQL execution plan bindings, TiDB enhances the capability and experience of SQL binding, including: + + - Use a single SQL statement to create SQL execution plan bindings from multiple historical execution plans to improve the efficiency of creating bindings. + - The SQL execution plan binding supports more optimizer hints, and optimizes the conversion method for complex execution plans, making the binding more stable in restoring the execution plan. + + For more information, see [documentation](/sql-plan-management.md). + +### Availability + +* TiProxy supports built-in virtual IP management [#583](https://github.com/pingcap/tiproxy/issues/583) @[djshow832](https://github.com/djshow832) + + Before v8.3.0, when using primary-secondary mode for high availability, TiProxy requires an additional component to manage the virtual IP address. Starting from v8.3.0, TiProxy supports built-in virtual IP management. In primary-secondary mode, when a primary node fails over, the new primary node will automatically bind to the specified virtual IP, ensuring that clients can always connect to an available TiProxy through the virtual IP. + + To enable virtual IP management, specify the virtual IP address using the TiProxy configuration item [`ha.virtual-ip`](/tiproxy/tiproxy-configuration.md#virtual-ip) and specify the network interface to bind the virtual IP to using [`ha.interface`](/tiproxy/tiproxy-configuration.md#interface). The virtual IP will be bound to a TiProxy instance only when both of these configuration items are set. + + For more information, see [documentation](/tiproxy/tiproxy-overview.md). + +### SQL + +* Support upgrading `SELECT LOCK IN SHARE MODE` to exclusive locks [#54999](https://github.com/pingcap/tidb/issues/54999) @[cfzjywxk](https://github.com/cfzjywxk) + + TiDB does not support `SELECT LOCK IN SHARE MODE` yet. Starting from v8.3.0, TiDB supports upgrading `SELECT LOCK IN SHARE MODE` to exclusive locks to enable support for `SELECT LOCK IN SHARE MODE`. You can control whether to enable this feature by using the new system variable [`tidb_enable_shared_lock_promotion`](/system-variables.md#tidb_enable_shared_lock_promotion-new-in-v830). + + For more information, see [documentation](/system-variables.md#tidb_enable_shared_lock_promotion-new-in-v830). + +### Observability + +* Show the progress of loading initial statistics [#53564](https://github.com/pingcap/tidb/issues/53564) @[hawkingrei](https://github.com/hawkingrei) + + TiDB loads basic statistics when it starts. In scenarios with many tables or partitions, this process can take a long time. When the configuration item [`force-init-stats`](/tidb-configuration-file.md#force-init-stats-new-in-v657-and-v710) is set to `ON`, TiDB does not provide services until the initial statistics are loaded. In this case, you need to observe the loading process to estimate the service start time. + + Starting from v8.3.0, TiDB prints the progress of loading initial statistics in stages in the log, so you can understand the running status. To provide formatted results to external tools, TiDB adds the additional [monitoring API](/tidb-monitoring-api.md) so you can obtain the progress of loading initial statistics at any time during the startup phase. 
+ +* Add metrics about Request Unit (RU) settings [#8444](https://github.com/tikv/pd/issues/8444) @[nolouch](https://github.com/nolouch) + +### Security + +* Enhance PD log redaction [#8305](https://github.com/tikv/pd/issues/8305) @[JmPotato](https://github.com/JmPotato) + + TiDB v8.0.0 enhances log redaction and supports marking user data in TiDB logs with `‹ ›`. Based on the marked logs, you can decide whether to redact the marked information when displaying the logs, thus increasing the flexibility of log redaction. In v8.2.0, TiFlash implements a similar log redaction enhancement. + + In v8.3.0, PD implements a similar log redaction enhancement. To use this feature, you can set the value of the PD configuration item `security.redact-info-log` to `"marker"`. + + For more information, see [documentation](/log-redaction.md#log-redaction-in-pd-side). + +* Enhance TiKV log redaction [#17206](https://github.com/tikv/tikv/issues/17206) @[lucasliang](https://github.com/LykxSassinator) + + TiDB v8.0.0 enhances log redaction and supports marking user data in TiDB logs with `‹ ›`. Based on the marked logs, you can decide whether to redact the marked information when displaying the logs, thus increasing the flexibility of log redaction. In v8.2.0, TiFlash implements a similar log redaction enhancement. + + In v8.3.0, TiKV implements a similar log redaction enhancement. To use this feature, you can set the value of the TiKV configuration item `security.redact-info-log` to `"marker"`. + + For more information, see [documentation](/log-redaction.md#log-redaction-in-tikv-side). + +### Data migration + +* TiCDC supports replicating DDL statements in bidirectional replication (BDR) mode (GA) [#10301](https://github.com/pingcap/tiflow/issues/10301) [#48519](https://github.com/pingcap/tidb/issues/48519) @[okJiang](https://github.com/okJiang) @[asddongmen](https://github.com/asddongmen) + + TiCDC v7.6.0 introduced the replication of DDL statements with bidirectional replication configured. Previously, bidirectional replication of DDL statements was not supported by TiCDC, so users of TiCDC's bidirectional replication had to execute DDL statements on both TiDB clusters separately. With this feature, after assigning a `PRIMARY` BDR role to a cluster, TiCDC can replicate the DDL statements from that cluster to the `SECONDARY` cluster. + + In v8.3.0, this feature becomes generally available (GA). + + For more information, see [documentation](/ticdc/ticdc-bidirectional-replication.md). + +## Compatibility changes + +> **Note:** +> +> This section provides compatibility changes you need to know when you upgrade from v8.2.0 to the current version (v8.3.0). If you are upgrading from v8.1.0 or earlier versions to the current version, you might also need to check the compatibility changes introduced in intermediate versions. + +### Behavior changes + +* To avoid incorrect use of commands, `pd-ctl` cancels the prefix matching mechanism. For example, `store remove-tombstone` cannot be called via `store remove` [#8413](https://github.com/tikv/pd/issues/8413) @[lhy1024](https://github.com/lhy1024) + +### System variables + +| Variable name | Change type | Description | +|--------|------------------------------|------| +| [`tidb_ddl_reorg_batch_size`](/system-variables.md#tidb_ddl_reorg_batch_size) | Modified | Adds the SESSION scope. | +| [`tidb_ddl_reorg_worker_cnt`](/system-variables.md#tidb_ddl_reorg_worker_cnt) | Modified | Adds the SESSION scope. 
| +| [`tidb_enable_column_tracking`](/system-variables.md#tidb_enable_column_tracking-new-in-v540) | Modified | Changes the default value from `OFF` to `ON` after further tests, which means that TiDB collects `PREDICATE COLUMNS` by default. | +| [`tidb_gc_concurrency`](/system-variables.md#tidb_gc_concurrency-new-in-v50) | Modified | Starting from v8.3.0, this variable controls the number of concurrent threads during the [Resolve Locks](/garbage-collection-overview.md#resolve-locks) and [Delete Range](/garbage-collection-overview.md#delete-ranges) steps of the [Garbage Collection (GC)](/garbage-collection-overview.md) process. Before v8.3.0, this variable only controls the number of threads during the [Resolve Locks](/garbage-collection-overview.md#resolve-locks) step. | +| [`tidb_low_resolution_tso`](/system-variables.md#tidb_low_resolution_tso) | Modified | Adds the GLOBAL scope. | +| [`tidb_opt_projection_push_down`](/system-variables.md#tidb_opt_projection_push_down-new-in-v610) | Modified | Adds the GLOBAL scope and persists the variable value to the cluster. Changes the default value from `OFF` to `ON` after further tests, which means that the optimizer is allowed to push `Projection` down to the TiKV coprocessor. | +| [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-new-in-v800) | Modified | The range of values has been modified to either `0` or `[536870912, 9223372036854775807]`. The minimum value is `536870912` bytes (that is, 512 MiB) to avoid setting a cache size that is too small and causing performance degradation. | +| [`tidb_analyze_column_options`](/system-variables.md#tidb_analyze_column_options-new-in-v830) | Newly added | Controls the behavior of the `ANALYZE TABLE` statement. Setting it to the default value `PREDICATE` means only collecting statistics for [predicate columns](/statistics.md#collect-statistics-on-some-columns); setting it to `ALL` means collecting statistics for all columns. | +| [`tidb_enable_lazy_cursor_fetch`](/system-variables.md#tidb_enable_lazy_cursor_fetch-new-in-v830) | Newly added | Controls the behavior of the [Cursor Fetch](/develop/dev-guide-connection-parameters.md#use-streamingresult-to-get-the-execution-result) feature. | +| [`tidb_enable_shared_lock_promotion`](/system-variables.md#tidb_enable_shared_lock_promotion-new-in-v830) | Newly added | Controls whether to enable the feature of upgrading shared locks to exclusive locks. The default value of this variable is `OFF`, which means that the function of upgrading shared locks to exclusive locks is disabled. | +| [`tiflash_hashagg_preaggregation_mode`](/system-variables.md#tiflash_hashagg_preaggregation_mode-new-in-v830) | Newly added | Controls the pre-aggregation strategy used during the first stage of two-stage or three-stage HashAgg operations pushed down to TiFlash. | + +### Configuration file parameters + +| Configuration file | Configuration parameter | Change type | Description | +| -------- | -------- | -------- | -------- | +| TiDB | [`tikv-client.batch-policy`](/tidb-configuration-file.md#batch-policy-new-in-v830) | Newly added | Controls the batching strategy for requests from TiDB to TiKV. | +| PD | [`security.redact-info-log`](/pd-configuration-file.md#redact-info-log-new-in-v50) | Modified | Support setting the value of the PD configuration item `security.redact-info-log` to `"marker"` to mark sensitive information in the log with `‹ ›` instead of shielding it directly. With the `"marker"` option, you can customize the redaction rules. 
| +| TiKV | [`security.redact-info-log`](/tikv-configuration-file.md#redact-info-log-new-in-v408) | Modified | Support setting the value of the TiKV configuration item `security.redact-info-log` to `"marker"` to mark sensitive information in the log with `‹ ›` instead of shielding it directly. With the `"marker"` option, you can customize the redaction rules. | +| TiFlash | [`security.redact-info-log`](/tiflash/tiflash-configuration.md#configure-the-tiflash-learnertoml-file) | Modified | Support setting the value of the TiFlash Learner configuration item `security.redact-info-log` to `"marker"` to mark sensitive information in the log with `‹ ›` instead of shielding it directly. With the `"marker"` option, you can customize the redaction rules. | +| BR | [`--allow-pitr-from-incremental`](/br/br-incremental-guide.md#limitations) | Newly added | Controls whether incremental backups are compatible with subsequent log backups. The default value is `true`, which means that incremental backups are compatible with subsequent log backups. When you keep the default value `true`, the DDLs that need to be replayed are strictly checked before the incremental restore begins. | + +### System tables + +* The [`INFORMATION_SCHEMA.PROCESSLIST`](/information-schema/information-schema-processlist.md) and [`INFORMATION_SCHEMA.CLUSTER_PROCESSLIST`](/information-schema/information-schema-processlist.md#cluster_processlist) system tables add the `SESSION_ALIAS` field to show the number of rows currently affected by the DML statement [#46889](https://github.com/pingcap/tidb/issues/46889) @[lcwangchao](https://github.com/lcwangchao) + +## Deprecated features + +* The following features are deprecated starting from v8.3.0: + + * Starting from v7.5.0, [TiDB Binlog](https://docs.pingcap.com/tidb/v8.3/tidb-binlog-overview) replication is deprecated. Starting from v8.3.0, TiDB Binlog is fully deprecated, with removal planned for a future release. For incremental data replication, use [TiCDC](/ticdc/ticdc-overview.md) instead. For point-in-time recovery (PITR), use [PITR](/br/br-pitr-guide.md). + * Starting from v8.3.0, the [`tidb_enable_column_tracking`](/system-variables.md#tidb_enable_column_tracking-new-in-v540) system variable is deprecated. TiDB tracks predicate columns by default. For more information, see [`tidb_analyze_column_options`](/system-variables.md#tidb_analyze_column_options-new-in-v830). + +* The following features are planned for deprecation in future versions: + + * TiDB introduces the system variable [`tidb_enable_auto_analyze_priority_queue`](/system-variables.md#tidb_enable_auto_analyze_priority_queue-new-in-v800), which controls whether priority queues are enabled to optimize the ordering of tasks that automatically collect statistics. In future releases, the priority queue will be the only way to order tasks for automatically collecting statistics, so this system variable will be deprecated. + * TiDB introduces the system variable [`tidb_enable_async_merge_global_stats`](/system-variables.md#tidb_enable_async_merge_global_stats-new-in-v750) in v7.5.0. You can use it to set TiDB to use asynchronous merging of partition statistics to avoid OOM issues. In future releases, partition statistics will be merged asynchronously, so this system variable will be deprecated. + * It is planned to redesign [the automatic evolution of execution plan bindings](/sql-plan-management.md#baseline-evolution) in subsequent releases, and the related variables and behavior will change. 
+ * In v8.0.0, TiDB introduces the [`tidb_enable_parallel_hashagg_spill`](/system-variables.md#tidb_enable_parallel_hashagg_spill-new-in-v800) system variable to control whether TiDB supports disk spill for the concurrent HashAgg algorithm. In future versions, the [`tidb_enable_parallel_hashagg_spill`](/system-variables.md#tidb_enable_parallel_hashagg_spill-new-in-v800) system variable will be deprecated. + * The TiDB Lightning parameter [`conflict.max-record-rows`](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task) is planned for deprecation in a future release and will be subsequently removed. This parameter will be replaced by [`conflict.threshold`](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task), which means that the maximum number of conflicting records is consistent with the maximum number of conflicting records that can be tolerated in a single import task. + +* The following features are planned for removal in future versions: + + * Starting from v8.0.0, TiDB Lightning deprecates the [old version of conflict detection](/tidb-lightning/tidb-lightning-physical-import-mode-usage.md#the-old-version-of-conflict-detection-deprecated-in-v800) strategy for the physical import mode, and enables you to control the conflict detection strategy for both logical and physical import modes via the [`conflict.strategy`](/tidb-lightning/tidb-lightning-configuration.md#tidb-lightning-task) parameter. The [`duplicate-resolution`](/tidb-lightning/tidb-lightning-configuration.md) parameter for the old version of conflict detection will be removed in a future release. + +## Improvements + ++ TiDB + + - Support the `SELECT ... STRAIGHT_JOIN ... USING ( ... )` statement [#54162](https://github.com/pingcap/tidb/issues/54162) @[dveeden](https://github.com/dveeden) + - Construct more precise index access ranges for filter conditions like `((idx_col_1 > 1) or (idx_col_1 = 1 and idx_col_2 > 10)) and ((idx_col_1 < 10) or (idx_col_1 = 10 and idx_col_2 < 20))` [#54337](https://github.com/pingcap/tidb/issues/54337) @[ghazalfamilyusa](https://github.com/ghazalfamilyusa) + - Use index order to avoid extra sorting operations for SQL queries like `WHERE idx_col_1 IS NULL ORDER BY idx_col_2` [#54188](https://github.com/pingcap/tidb/issues/54188) @[ari-e](https://github.com/ari-e) + - Display analyzed indexes in the `mysql.analyze_jobs` system table [#53567](https://github.com/pingcap/tidb/issues/53567) @[hi-rustin](https://github.com/Rustin170506) + - Support applying the `tidb_redact_log` setting to the output of `EXPLAIN` statements and further optimize the logic in processing logs [#54565](https://github.com/pingcap/tidb/issues/54565) @[hawkingrei](https://github.com/hawkingrei) + - Support generating the `Selection` operator on `IndexRangeScan` for multi-valued indexes to improve query efficiency [#54876](https://github.com/pingcap/tidb/issues/54876) @[time-and-fate](https://github.com/time-and-fate) + - Support killing automatic `ANALYZE` tasks that are running outside the set time window [#55283](https://github.com/pingcap/tidb/issues/55283) @[hawkingrei](https://github.com/hawkingrei) + - Adjust estimation results from 0 to 1 for equality conditions that do not hit TopN when statistics are entirely composed of TopN and the modified row count in the corresponding table statistics is non-zero [#47400](https://github.com/pingcap/tidb/issues/47400) @[terry1purcell](https://github.com/terry1purcell) + - The TopN operator supports disk spill 
[#47733](https://github.com/pingcap/tidb/issues/47733) @[xzhangxian1008](https://github.com/xzhangxian1008) + - TiDB node supports executing queries with the `WITH ROLLUP` modifier and the `GROUPING` function [#42631](https://github.com/pingcap/tidb/issues/42631) @[Arenatlx](https://github.com/Arenatlx) + - The system variable [`tidb_low_resolution_tso`](/system-variables.md#tidb_low_resolution_tso) supports the `GLOBAL` scope [#55022](https://github.com/pingcap/tidb/issues/55022) @[cfzjywxk](https://github.com/cfzjywxk) + - Improve GC (Garbage Collection) efficiency by supporting concurrent range deletion. You can control the number of concurrent threads using [`tidb_gc_concurrency`](/system-variables.md#tidb_gc_concurrency-new-in-v50) [#54570](https://github.com/pingcap/tidb/issues/54570) @[ekexium](https://github.com/ekexium) + - Improve the performance of bulk DML execution mode (`tidb_dml_type = "bulk"`) [#50215](https://github.com/pingcap/tidb/issues/50215) @[ekexium](https://github.com/ekexium) + - Improve the performance of schema information cache-related interface `SchemaByID` [#54074](https://github.com/pingcap/tidb/issues/54074) @[ywqzzy](https://github.com/ywqzzy) + - Improve the query performance for certain system tables when schema information caching is enabled [#50305](https://github.com/pingcap/tidb/issues/50305) @[tangenta](https://github.com/tangenta) + - Optimize error messages for conflicting keys when adding unique indexes [#53004](https://github.com/pingcap/tidb/issues/53004) @[lance6716](https://github.com/lance6716) + ++ PD + + - Support modifying the `batch` configuration of the `evict-leader-scheduler` via `pd-ctl` to accelerate the leader eviction process [#8265](https://github.com/tikv/pd/issues/8265) @[rleungx](https://github.com/rleungx) + - Add the `store_id` monitoring metric to the **Cluster > Label distribution** panel in Grafana to display store IDs corresponding to different labels [#8337](https://github.com/tikv/pd/issues/8337) @[HuSharp](https://github.com/HuSharp) + - Support fallback to the default resource group when the specified resource group does not exist [#8388](https://github.com/tikv/pd/issues/8388) @[JmPotato](https://github.com/JmPotato) + - Add the `approximate_kv_size` field to the Region information output by the `region` command in `pd-ctl` [#8412](https://github.com/tikv/pd/issues/8412) @[zeminzhou](https://github.com/zeminzhou) + - Optimize the message that returns when you call the PD API to delete the TTL configuration [#8450](https://github.com/tikv/pd/issues/8450) @[lhy1024](https://github.com/lhy1024) + - Optimize the RU consumption behavior of large query read requests to reduce the impact on other requests [#8457](https://github.com/tikv/pd/issues/8457) @[nolouch](https://github.com/nolouch) + - Optimize the error message that returns when you misconfigure PD microservices [#52912](https://github.com/pingcap/tidb/issues/52912) @[rleungx](https://github.com/rleungx) + - Add the `--name` startup parameter to PD microservices to more accurately display the service name during deployment [#7995](https://github.com/tikv/pd/issues/7995) @[HuSharp](https://github.com/HuSharp) + - Support dynamically adjusting `PatrolRegionScanLimit` based on the number of Regions to reduce Region scan time [#7963](https://github.com/tikv/pd/issues/7963) @[lhy1024](https://github.com/lhy1024) + ++ TiKV + + - Optimize the batching policy for writing Raft logs when `async-io` is enabled to reduce the consumption of disk I/O bandwidth resources 
[#16907](https://github.com/tikv/tikv/issues/16907) @[LykxSassinator](https://github.com/LykxSassinator) + - Redesign the TiCDC delegate and downstream modules to better support Region partial subscription [#16362](https://github.com/tikv/tikv/issues/16362) @[hicqu](https://github.com/hicqu) + - Reduce the size of a single slow query log [#17294](https://github.com/tikv/tikv/issues/17294) @[Connor1996](https://github.com/Connor1996) + - Add a new monitoring metric `min safe ts` [#17307](https://github.com/tikv/tikv/issues/17307) @[mittalrishabh](https://github.com/mittalrishabh) + - Reduce the memory usage of the peer message channel [#16229](https://github.com/tikv/tikv/issues/16229) @[Connor1996](https://github.com/Connor1996) + ++ TiFlash + + - Support generating ad hoc heap profiling in SVG format [#9320](https://github.com/pingcap/tiflash/issues/9320) @[CalvinNeo](https://github.com/CalvinNeo) + ++ Tools + + + Backup & Restore (BR) + + - Support checking whether a full backup exists before starting point-in-time recovery (PITR) for the first time. If the full backup is not found, BR terminates the restore and returns an error [#54418](https://github.com/pingcap/tidb/issues/54418) @[Leavrth](https://github.com/Leavrth) + - Support checking whether the disk space in TiKV and TiFlash is sufficient before restoring snapshot backups. If the space is insufficient, BR terminates the restore and returns an error [#54316](https://github.com/pingcap/tidb/issues/54316) @[RidRisR](https://github.com/RidRisR) + - Support checking whether the disk space in TiKV is sufficient before TiKV downloads each SST file. If the space is insufficient, BR terminates the restore and returns an error [#17224](https://github.com/tikv/tikv/issues/17224) @[RidRisR](https://github.com/RidRisR) + - Support setting Alibaba Cloud access credentials through environment variables [#45551](https://github.com/pingcap/tidb/issues/45551) @[RidRisR](https://github.com/RidRisR) + - Automatically set the environment variable `GOMEMLIMIT` based on the available memory of the BR process to avoid OOM when using BR for backup and restore [#53777](https://github.com/pingcap/tidb/issues/53777) @[Leavrth](https://github.com/Leavrth) + - Make incremental backups compatible with point-in-time recovery (PITR) [#54474](https://github.com/pingcap/tidb/issues/54474) @[3pointer](https://github.com/3pointer) + - Support backing up and restoring the `mysql.column_stats_usage` table [#53567](https://github.com/pingcap/tidb/issues/53567) @[hi-rustin](https://github.com/Rustin170506) + +## Bug fixes + ++ TiDB + + - Reset the parameters in the `Open` method of `PipelinedWindow` to fix the unexpected error that occurs when the `PipelinedWindow` is used as a child node of `Apply` due to the reuse of previous parameter values caused by repeated opening and closing operations [#53600](https://github.com/pingcap/tidb/issues/53600) @[XuHuaiyu](https://github.com/XuHuaiyu) + - Fix the issue that the query might get stuck when terminated because the memory usage exceeds the limit set by `tidb_mem_quota_query` [#55042](https://github.com/pingcap/tidb/issues/55042) @[yibin87](https://github.com/yibin87) + - Fix the issue that the disk spill for the HashAgg operator causes incorrect query results during parallel calculation [#55290](https://github.com/pingcap/tidb/issues/55290) @[xzhangxian1008](https://github.com/xzhangxian1008) + - Fix the issue of wrong `JSON_TYPE` when casting `YEAR` to JSON format [#54494](https://github.com/pingcap/tidb/issues/54494) 
@[YangKeao](https://github.com/YangKeao) + - Fix the issue that the value range of the `tidb_schema_cache_size` system variable is wrong [#54034](https://github.com/pingcap/tidb/issues/54034) @[lilinghai](https://github.com/lilinghai) + - Fix the issue that partition pruning does not work when the partition expression is `EXTRACT(YEAR FROM col)` [#54210](https://github.com/pingcap/tidb/issues/54210) @[mjonss](https://github.com/mjonss) + - Fix the issue that `FLASHBACK DATABASE` fails when many tables exist in the database [#54415](https://github.com/pingcap/tidb/issues/54415) @[lance6716](https://github.com/lance6716) + - Fix the issue that `FLASHBACK DATABASE` enters an infinite loop when handling many databases [#54915](https://github.com/pingcap/tidb/issues/54915) @[lance6716](https://github.com/lance6716) + - Fix the issue that adding an index in index acceleration mode might fail [#54568](https://github.com/pingcap/tidb/issues/54568) @[lance6716](https://github.com/lance6716) + - Fix the issue that `ADMIN CANCEL DDL JOBS` might cause DDL to fail [#54687](https://github.com/pingcap/tidb/issues/54687) @[lance6716](https://github.com/lance6716) + - Fix the issue that table replication fails when the index length of the table replicated from DM exceeds the maximum length specified by `max-index-length` [#55138](https://github.com/pingcap/tidb/issues/55138) @[lance6716](https://github.com/lance6716) + - Fix the issue that the error `runtime error: index out of range` might occur when executing SQL statements with `tidb_enable_inl_join_inner_multi_pattern` enabled [#54535](https://github.com/pingcap/tidb/issues/54535) @[joechenrh](https://github.com/joechenrh) + - Fix the issue that you cannot exit TiDB using Control+C during the process of initializing statistics [#54589](https://github.com/pingcap/tidb/issues/54589) @[tiancaiamao](https://github.com/tiancaiamao) + - Fix the issue that the `INL_MERGE_JOIN` optimizer hint returns incorrect results by deprecating it [#54064](https://github.com/pingcap/tidb/issues/54064) @[AilinKid](https://github.com/AilinKid) + - Fix the issue that a correlated subquery that contains `WITH ROLLUP` might cause TiDB to panic and return the error `runtime error: index out of range` [#54983](https://github.com/pingcap/tidb/issues/54983) @[AilinKid](https://github.com/AilinKid) + - Fix the issue that predicates cannot be pushed down properly when the filter condition of a SQL query contains virtual columns and the execution condition contains `UnionScan` [#54870](https://github.com/pingcap/tidb/issues/54870) @[qw4990](https://github.com/qw4990) + - Fix the issue that the error `runtime error: invalid memory address or nil pointer dereference` might occur when executing SQL statements with `tidb_enable_inl_join_inner_multi_pattern` enabled [#55169](https://github.com/pingcap/tidb/issues/55169) @[hawkingrei](https://github.com/hawkingrei) + - Fix the issue that a query statement that contains `UNION` might return incorrect results [#52985](https://github.com/pingcap/tidb/issues/52985) @[XuHuaiyu](https://github.com/XuHuaiyu) + - Fix the issue that the `tot_col_size` column in the `mysql.stats_histograms` table might be a negative number [#55126](https://github.com/pingcap/tidb/issues/55126) @[qw4990](https://github.com/qw4990) + - Fix the issue that `columnEvaluator` cannot identify the column references in the input chunk, which leads to `runtime error: index out of range` when executing SQL statements [#53713](https://github.com/pingcap/tidb/issues/53713) 
@[AilinKid](https://github.com/AilinKid) + - Fix the issue that `STATS_EXTENDED` becomes a reserved keyword [#39573](https://github.com/pingcap/tidb/issues/39573) @[wddevries](https://github.com/wddevries) + - Fix the issue that when `tidb_low_resolution` is enabled, `select for update` can be executed [#54684](https://github.com/pingcap/tidb/issues/54684) @[cfzjywxk](https://github.com/cfzjywxk) + - Fix the issue that internal SQL queries cannot be displayed in the slow query log when `tidb_redact_log` is enabled [#54190](https://github.com/pingcap/tidb/issues/54190) @[lcwangchao](https://github.com/lcwangchao) + - Fix the issue that the memory used by transactions might be tracked multiple times [#53984](https://github.com/pingcap/tidb/issues/53984) @[ekexium](https://github.com/ekexium) + - Fix the issue that using `SHOW WARNINGS;` to obtain warnings might cause a panic [#48756](https://github.com/pingcap/tidb/issues/48756) @[xhebox](https://github.com/xhebox) + - Fix the issue that loading index statistics might cause memory leaks [#54022](https://github.com/pingcap/tidb/issues/54022) @[hi-rustin](https://github.com/Rustin170506) + - Fix the issue that the `LENGTH()` condition is unexpectedly removed when the collation is `utf8_bin` or `utf8mb4_bin` [#53730](https://github.com/pingcap/tidb/issues/53730) @[elsa0520](https://github.com/elsa0520) + - Fix the issue that statistics collection does not update the `stats_history` table when encountering duplicate primary keys [#47539](https://github.com/pingcap/tidb/issues/47539) @[Defined2014](https://github.com/Defined2014) + - Fix the issue that recursive CTE queries might result in invalid pointers [#54449](https://github.com/pingcap/tidb/issues/54449) @[hawkingrei](https://github.com/hawkingrei) + - Fix the issue that the Connection Count monitoring metric in Grafana is incorrect when some connections exit before the handshake is complete [#54428](https://github.com/pingcap/tidb/issues/54428) @[YangKeao](https://github.com/YangKeao) + - Fix the issue that the Connection Count of each resource group is incorrect when using TiProxy and resource groups [#54545](https://github.com/pingcap/tidb/issues/54545) @[YangKeao](https://github.com/YangKeao) + - Fix the issue that when queries contain non-correlated subqueries and `LIMIT` clauses, column pruning might be incomplete, resulting in a less optimal plan [#54213](https://github.com/pingcap/tidb/issues/54213) @[qw4990](https://github.com/qw4990) + - Fix the issue of reusing wrong point get plans for `SELECT ... 
FOR UPDATE` [#54652](https://github.com/pingcap/tidb/issues/54652) @[qw4990](https://github.com/qw4990) + - Fix the issue that the `TIMESTAMPADD()` function goes into an infinite loop when the first argument is `month` and the second argument is negative [#54908](https://github.com/pingcap/tidb/issues/54908) @[xzhangxian1008](https://github.com/xzhangxian1008) + - Fix the issue that internal SQL statements in the slow log are redacted to null by default [#54190](https://github.com/pingcap/tidb/issues/54190) [#52743](https://github.com/pingcap/tidb/issues/52743) [#53264](https://github.com/pingcap/tidb/issues/53264) @[lcwangchao](https://github.com/lcwangchao) + - Fix the issue that `PointGet` execution plans for `_tidb_rowid` can be generated [#54583](https://github.com/pingcap/tidb/issues/54583) @[Defined2014](https://github.com/Defined2014) + - Fix the issue that `SHOW IMPORT JOBS` reports an error `Unknown column 'summary'` after upgrading from v7.1 [#54241](https://github.com/pingcap/tidb/issues/54241) @[tangenta](https://github.com/tangenta) + - Fix the issue that obtaining the column information using `information_schema.columns` returns warning 1356 when a subquery is used as a column definition in a view definition [#54343](https://github.com/pingcap/tidb/issues/54343) @[lance6716](https://github.com/lance6716) + - Fix the issue that RANGE partitioned tables that are not strictly self-incrementing can be created [#54829](https://github.com/pingcap/tidb/issues/54829) @[Defined2014](https://github.com/Defined2014) + - Fix the issue that `INDEX_HASH_JOIN` cannot exit properly when SQL is abnormally interrupted [#54688](https://github.com/pingcap/tidb/issues/54688) @[wshwsh12](https://github.com/wshwsh12) + - Fix the issue that the network partition during adding indexes using the Distributed eXecution Framework (DXF) might cause inconsistent data indexes [#54897](https://github.com/pingcap/tidb/issues/54897) @[tangenta](https://github.com/tangenta) + ++ PD + + - Fix the issue that no error is reported when binding a role to a resource group [#54417](https://github.com/pingcap/tidb/issues/54417) @[JmPotato](https://github.com/JmPotato) + - Fix the issue that a resource group encounters quota limits when requesting tokens for more than 500 ms [#8349](https://github.com/tikv/pd/issues/8349) @[nolouch](https://github.com/nolouch) + - Fix the issue that the time data type in the `INFORMATION_SCHEMA.RUNAWAY_WATCHES` table is incorrect [#54770](https://github.com/pingcap/tidb/issues/54770) @[HuSharp](https://github.com/HuSharp) + - Fix the issue that resource groups could not effectively limit resource usage under high concurrency [#8435](https://github.com/tikv/pd/issues/8435) @[nolouch](https://github.com/nolouch) + - Fix the issue that an incorrect PD API is called when you retrieve table attributes [#55188](https://github.com/pingcap/tidb/issues/55188) @[JmPotato](https://github.com/JmPotato) + - Fix the issue that the scaling progress is displayed incorrectly after the `scheduling` microservice is enabled [#8331](https://github.com/tikv/pd/issues/8331) @[rleungx](https://github.com/rleungx) + - Fix the issue that the encryption manager is not initialized before use [#8384](https://github.com/tikv/pd/issues/8384) @[rleungx](https://github.com/rleungx) + - Fix the issue that some logs are not redacted [#8419](https://github.com/tikv/pd/issues/8419) @[rleungx](https://github.com/rleungx) + - Fix the issue that redirection might panic during the startup of PD microservices 
[#8406](https://github.com/tikv/pd/issues/8406) @[HuSharp](https://github.com/HuSharp) + - Fix the issue that the `split-merge-interval` configuration item might not take effect when you modify its value repeatedly (such as changing it from `1s` to `1h` and back to `1s`) [#8404](https://github.com/tikv/pd/issues/8404) @[lhy1024](https://github.com/lhy1024) + - Fix the issue that setting `replication.strictly-match-label` to `true` causes TiFlash to fail to start [#8480](https://github.com/tikv/pd/issues/8480) @[rleungx](https://github.com/rleungx) + - Fix the issue that fetching TSO is slow when analyzing large partitioned tables, causing `ANALYZE` performance degradation [#8500](https://github.com/tikv/pd/issues/8500) @[rleungx](https://github.com/rleungx) + - Fix the potential data races in large clusters [#8386](https://github.com/tikv/pd/issues/8386) @[rleungx](https://github.com/rleungx) + - Fix the issue that when determining whether queries are Runaway Queries, TiDB only counts time consumption spent on the Coprocessor side while missing time consumption spent on the TiDB side, resulting in some queries not being identified as Runaway Queries [#51325](https://github.com/pingcap/tidb/issues/51325) @[HuSharp](https://github.com/HuSharp) + ++ TiFlash + + - Fix the issue that when using the `CAST()` function to convert a string to a datetime with a time zone or invalid characters, the result is incorrect [#8754](https://github.com/pingcap/tiflash/issues/8754) @[solotzg](https://github.com/solotzg) + - Fix the issue that TiFlash might panic after executing `RENAME TABLE ... TO ...` on a partitioned table with empty partitions across databases [#9132](https://github.com/pingcap/tiflash/issues/9132) @[JaySon-Huang](https://github.com/JaySon-Huang) + - Fix the issue that some queries might report a column type mismatch error after late materialization is enabled [#9175](https://github.com/pingcap/tiflash/issues/9175) @[JinheLin](https://github.com/JinheLin) + - Fix the issue that queries with virtual generated columns might return incorrect results after late materialization is enabled [#9188](https://github.com/pingcap/tiflash/issues/9188) @[JinheLin](https://github.com/JinheLin) + - Fix the issue that setting the SSL certificate configuration to an empty string in TiFlash incorrectly enables TLS and causes TiFlash to fail to start [#9235](https://github.com/pingcap/tiflash/issues/9235) @[JaySon-Huang](https://github.com/JaySon-Huang) + - Fix the issue that TiFlash might panic when a database is deleted shortly after creation [#9266](https://github.com/pingcap/tiflash/issues/9266) @[JaySon-Huang](https://github.com/JaySon-Huang) + - Fix the issue that a network partition (network disconnection) between TiFlash and any PD might cause read request timeout errors [#9243](https://github.com/pingcap/tiflash/issues/9243) @[Lloyd-Pottiger](https://github.com/Lloyd-Pottiger) + - Fix the issue that TiFlash write nodes might fail to restart in the disaggregated storage and compute architecture [#9282](https://github.com/pingcap/tiflash/issues/9282) @[JaySon-Huang](https://github.com/JaySon-Huang) + - Fix the issue that read snapshots of TiFlash write nodes are not released in a timely manner in the disaggregated storage and compute architecture [#9298](https://github.com/pingcap/tiflash/issues/9298) @[JinheLin](https://github.com/JinheLin) + ++ TiKV + + - Fix the issue that cleaning up stale regions might accidentally delete valid data [#17258](https://github.com/tikv/tikv/issues/17258) 
@[hbisheng](https://github.com/hbisheng) + - Fix the issue that `Ingestion picked level` and `Compaction Job Size(files)` are displayed incorrectly in the TiKV dashboard in Grafana [#15990](https://github.com/tikv/tikv/issues/15990) @[Connor1996](https://github.com/Connor1996) + - Fix the issue that `cancel_generating_snap` incorrectly updating `snap_tried_cnt` causes TiKV to panic [#17226](https://github.com/tikv/tikv/issues/17226) @[hbisheng](https://github.com/hbisheng) + - Fix the issue that the information of `Ingest SST duration seconds` is incorrect [#17239](https://github.com/tikv/tikv/issues/17239) @[LykxSassinator](https://github.com/LykxSassinator) + - Fix the issue that CPU profiling flag is not reset correctly when an error occurs [#17234](https://github.com/tikv/tikv/issues/17234) @[Connor1996](https://github.com/Connor1996) + - Fix the issue that bloom filters are incompatible between earlier versions (earlier than v7.1) and later versions [#17272](https://github.com/tikv/tikv/issues/17272) @[v01dstar](https://github.com/v01dstar) + ++ Tools + + + Backup & Restore (BR) + + - Fix the issue that DDLs requiring backfilling, such as `ADD INDEX` and `MODIFY COLUMN`, might not be correctly recovered during incremental restore [#54426](https://github.com/pingcap/tidb/issues/54426) @[3pointer](https://github.com/3pointer) + - Fix the issue that the progress is stuck during backup and restore [#54140](https://github.com/pingcap/tidb/issues/54140) @[Leavrth](https://github.com/Leavrth) + - Fix the issue that the checkpoint path of backup and restore is incompatible with some external storage [#55265](https://github.com/pingcap/tidb/issues/55265) @[Leavrth](https://github.com/Leavrth) + + + TiCDC + + - Fix the issue that the processor might get stuck when the downstream Kafka is inaccessible [#11340](https://github.com/pingcap/tiflow/issues/11340) @[asddongmen](https://github.com/asddongmen) + + + TiDB Data Migration (DM) + + - Fix the issue that schema tracker incorrectly handles LIST partition tables, causing DM errors [#11408](https://github.com/pingcap/tiflow/issues/11408) @[lance6716](https://github.com/lance6716) + - Fix the issue that data replication is interrupted when the index length exceeds the default value of `max-index-length` [#11459](https://github.com/pingcap/tiflow/issues/11459) @[michaelmdeng](https://github.com/michaelmdeng) + - Fix the issue that DM cannot handle `FAKE_ROTATE_EVENT` correctly [#11381](https://github.com/pingcap/tiflow/issues/11381) @[lance6716](https://github.com/lance6716) + + + TiDB Lightning + + - Fix the issue that TiDB Lightning outputs a confusing `WARN` log when it fails to obtain the keyspace name [#54232](https://github.com/pingcap/tidb/issues/54232) @[kennytm](https://github.com/kennytm) + - Fix the issue that the TLS configuration of TiDB Lightning affects cluster certificates [#54172](https://github.com/pingcap/tidb/issues/54172) @[ei-sugimoto](https://github.com/ei-sugimoto) + - Fix the issue that transaction conflicts occur during data import using TiDB Lightning [#49826](https://github.com/pingcap/tidb/issues/49826) @[lance6716](https://github.com/lance6716) + - Fix the issue that large checkpoint files cause performance degradation during the import of numerous databases and tables [#55054](https://github.com/pingcap/tidb/issues/55054) @[D3Hunter](https://github.com/D3Hunter) + +## Contributors + +We would like to thank the following contributors from the TiDB community: + +- [ari-e](https://github.com/ari-e) +- 
[ei-sugimoto](https://github.com/ei-sugimoto)
+- [HaoW30](https://github.com/HaoW30)
+- [JackL9u](https://github.com/JackL9u)
+- [michaelmdeng](https://github.com/michaelmdeng)
+- [mittalrishabh](https://github.com/mittalrishabh)
+- [qingfeng777](https://github.com/qingfeng777)
+- [SandeepPadhi](https://github.com/SandeepPadhi)
+- [yzhan1](https://github.com/yzhan1)
diff --git a/system-variables.md b/system-variables.md
index 3692fd799b35b..03699bc49556f 100644
--- a/system-variables.md
+++ b/system-variables.md
@@ -5327,13 +5327,13 @@ For details, see [Identify Slow Queries](/identify-slow-queries.md).

-- This variable is used to configure the different cluster IDs in a [bi-directional replication](/ticdc/ticdc-bidirectional-replication.md) cluster.
+- This variable is used to configure the different cluster IDs in a [bidirectional replication](/ticdc/ticdc-bidirectional-replication.md) cluster.

-- This variable is used to configure the different cluster IDs in a [bi-directional replication](https://docs.pingcap.com/tidb/stable/ticdc-bidirectional-replication) cluster.
+- This variable is used to configure the different cluster IDs in a [bidirectional replication](https://docs.pingcap.com/tidb/stable/ticdc-bidirectional-replication) cluster.
diff --git a/ticdc/ticdc-bidirectional-replication.md b/ticdc/ticdc-bidirectional-replication.md
index 9c9af87547fcf..7b83d6ed591d7 100644
--- a/ticdc/ticdc-bidirectional-replication.md
+++ b/ticdc/ticdc-bidirectional-replication.md
@@ -5,13 +5,13 @@ summary: Learn how to use bidirectional replication of TiCDC.

 # Bidirectional Replication

-TiCDC supports bi-directional replication (BDR) among two TiDB clusters. Based on this feature, you can create a multi-active TiDB solution using TiCDC.
+TiCDC supports bidirectional replication (BDR) between two TiDB clusters. Based on this feature, you can create a multi-active TiDB solution using TiCDC.

-This section describes how to use bi-directional replication taking two TiDB clusters as an example.
+This section describes how to use bidirectional replication, using two TiDB clusters as an example.

-## Deploy bi-directional replication
+## Deploy bidirectional replication

-TiCDC only replicates incremental data changes that occur after a specified timestamp to the downstream cluster. Before starting the bi-directional replication, you need to take the following steps:
+TiCDC only replicates incremental data changes that occur after a specified timestamp to the downstream cluster. Before starting bidirectional replication, complete the following steps:

 1. (Optional) According to your needs, import the data of the two TiDB clusters into each other using the data export tool [Dumpling](/dumpling-overview.md) and data import tool [TiDB Lightning](/tidb-lightning/tidb-lightning-overview.md).

@@ -21,26 +21,26 @@ TiCDC only replicates incremental data changes that occur after a specified time

 3. Specify the starting time point of data replication for the upstream and downstream clusters.

-    1. Check the time point of the upstream and downstream clusters. In the case of two TiDB clusters, make sure that data in the two clusters are consistent at certain time points. For example, the data of TiDB A at `ts=1` and the data of TiDB B at `ts=2` are consistent.
+    1. Check the time point of the upstream and downstream clusters. In the case of two TiDB clusters, make sure that the data in the two clusters is consistent at certain time points. 
For example, the data of TiDB 1 at `ts=1` and the data of TiDB 2 at `ts=2` are consistent.

-    2. When you create the changefeed, set the `--start-ts` of the changefeed for the upstream cluster to the corresponding `tso`. That is, if the upstream cluster is TiDB A, set `--start-ts=1`; if the upstream cluster is TiDB B, set `--start-ts=2`.
+    2. When you create the changefeed, set the `--start-ts` of the changefeed for the upstream cluster to the corresponding `tso`. That is, if the upstream cluster is TiDB 1, set `--start-ts=1`; if the upstream cluster is TiDB 2, set `--start-ts=2`.

 4. In the configuration file specified by the `--config` parameter, add the following configuration:

     ```toml
-    # Whether to enable the bi-directional replication mode
+    # Whether to enable the bidirectional replication mode
    bdr-mode = true
    ```

-After the configuration takes effect, the clusters can perform bi-directional replication.
+After the configuration takes effect, the clusters can perform bidirectional replication.

 ## DDL types

-Starting from v7.6.0, to support DDL replication as much as possible in bi-directional replication, TiDB divides the [DDLs that TiCDC originally supports](/ticdc/ticdc-ddl.md) into two types: replicable DDLs and non-replicable DDLs, according to the impact of DDLs on the business.
+Starting from v7.6.0, to support DDL replication as much as possible in bidirectional replication, TiDB divides the [DDLs that TiCDC originally supports](/ticdc/ticdc-ddl.md) into two types, replicable DDLs and non-replicable DDLs, according to their impact on the application.

 ### Replicable DDLs

-Replicable DDLs are the DDLs that can be directly executed and replicated to other TiDB clusters in bi-directional replication.
+Replicable DDLs are the DDLs that can be directly executed and replicated to other TiDB clusters in bidirectional replication.

 Replicable DDLs include:

@@ -63,7 +63,7 @@ Replicable DDLs include:

 ### Non-replicable DDLs

-Non-replicable DDLs are the DDLs that have a great impact on the business, and might cause data inconsistency between clusters. Non-replicable DDLs cannot be directly replicated to other TiDB clusters in bi-directional replication through TiCDC. Non-replicable DDLs must be executed through specific operations.
+Non-replicable DDLs are the DDLs that have a significant impact on the application and might cause data inconsistency between clusters. They cannot be directly replicated to other TiDB clusters through TiCDC in bidirectional replication, and must be executed through specific operations.

 Non-replicable DDLs include:

@@ -137,11 +137,11 @@ When no BDR role is set, you can execute any DDL. But after you set `bdr_mode=tr

 >
 > After you execute `ADMIN UNSET BDR ROLE` on all TiDB clusters, none of the DDLs are replicated by TiCDC. You need to manually execute the DDLs on each cluster separately.

-## Stop bi-directional replication
+## Stop bidirectional replication

 After the application has stopped writing data, you can insert a special record into each cluster. By checking the two special records, you can make sure that the data in the two clusters is consistent.

-After the check is completed, you can stop the changefeed to stop bi-directional replication, and execute `ADMIN UNSET BDR ROLE` on all TiDB clusters.
+After the check is completed, you can stop the changefeed to stop bidirectional replication, and execute `ADMIN UNSET BDR ROLE` on all TiDB clusters, as shown in the sketch below.
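+
+The following is a minimal SQL sketch of this shutdown check. The `test.bdr_check` marker table and its values are illustrative assumptions, not part of TiCDC; any table that both clusters replicate works:
+
+```sql
+-- Run after the application has stopped writing data.
+-- Create the marker table on each cluster if it does not already exist.
+CREATE TABLE IF NOT EXISTS test.bdr_check (id INT PRIMARY KEY, note VARCHAR(64));
+
+-- Insert a distinct marker record on each cluster.
+INSERT INTO test.bdr_check VALUES (1, 'marker-from-cluster-1'); -- on TiDB 1
+INSERT INTO test.bdr_check VALUES (2, 'marker-from-cluster-2'); -- on TiDB 2
+
+-- When both markers are visible on both clusters, replication has caught up.
+SELECT * FROM test.bdr_check;
+
+-- After stopping the changefeeds, clear the BDR role on every cluster.
+ADMIN UNSET BDR ROLE;
+```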
## Limitations

@@ -154,7 +154,11 @@ After the check is completed, you can stop the changefeed to stop bi-directional

 >
 > Do not set the BDR role in other scenarios, for example, setting `PRIMARY`, `SECONDARY`, and no BDR roles at the same time. If you set the BDR role incorrectly, TiDB cannot guarantee data correctness and consistency during data replication.

-- Usually do not use `AUTO_INCREMENT` or `AUTO_RANDOM` to avoid data conflicts in the replicated tables. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` for different clusters to ensure that different clusters can be assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bi-directional replication, you can set them as follows:
+- To avoid data conflicts in the replicated tables, avoid using [`AUTO_INCREMENT`](/auto-increment.md) or [`AUTO_RANDOM`](/auto-random.md) where possible. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` values for different clusters to ensure that different clusters are assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bidirectional replication, you can set them as follows (see the SQL sketch at the end of this section):

    - In Cluster A, set `auto_increment_increment=3` and `auto_increment_offset=2000`
    - In Cluster B, set `auto_increment_increment=3` and `auto_increment_offset=2001`
@@ -162,6 +166,6 @@ After the check is completed, you can stop the changefeed to stop bi-directional
    - In Cluster C, set `auto_increment_increment=3` and `auto_increment_offset=2002`

    This way, A, B, and C will not conflict with each other in the implicitly assigned `AUTO_INCREMENT` ID and `AUTO_RANDOM` ID. If you need to add a cluster in BDR mode, you need to temporarily stop writing data of the related application, set the appropriate values for `auto_increment_increment` and `auto_increment_offset` on all clusters, and then resume writing data of the related application.

-- Bi-directional replication clusters cannot detect write conflicts, which might cause undefined behaviors. Therefore, you must ensure that there are no write conflicts from the application side.
+- Bidirectional replication clusters cannot detect write conflicts, which might cause undefined behaviors. Therefore, you must ensure that there are no write conflicts from the application side.

-- Bi-directional replication supports more than two clusters, but does not support multiple clusters in cascading mode, that is, a cyclic replication like TiDB A -> TiDB B -> TiDB C -> TiDB A. In such a topology, if one cluster fails, the whole data replication will be affected. Therefore, to enable bi-directional replication among multiple clusters, you need to connect each cluster with every other clusters, for example, `TiDB A <-> TiDB B`, `TiDB B <-> TiDB C`, `TiDB C <-> TiDB A`.
+- Bidirectional replication supports more than two clusters, but does not support multiple clusters in cascading mode, that is, a cyclic replication topology like TiDB A -> TiDB B -> TiDB C -> TiDB A. In such a topology, if one cluster fails, the whole data replication will be affected. Therefore, to enable bidirectional replication among multiple clusters, you need to connect each cluster with every other cluster, for example, `TiDB A <-> TiDB B`, `TiDB B <-> TiDB C`, `TiDB C <-> TiDB A`.
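+
+As a minimal SQL sketch of the offset scheme in the `AUTO_INCREMENT`/`AUTO_RANDOM` limitation above (the values come from that list; using the `GLOBAL` scope is one reasonable way to apply them, and only sessions created afterward pick up the new values):
+
+```sql
+-- On Cluster A:
+SET GLOBAL auto_increment_increment = 3;
+SET GLOBAL auto_increment_offset = 2000;
+
+-- On Cluster B:
+SET GLOBAL auto_increment_increment = 3;
+SET GLOBAL auto_increment_offset = 2001;
+
+-- On Cluster C:
+SET GLOBAL auto_increment_increment = 3;
+SET GLOBAL auto_increment_offset = 2002;
+```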
diff --git a/ticdc/ticdc-changefeed-config.md b/ticdc/ticdc-changefeed-config.md
index 66fe73596bb90..f09b625059e01 100644
--- a/ticdc/ticdc-changefeed-config.md
+++ b/ticdc/ticdc-changefeed-config.md
@@ -36,6 +36,442 @@ Info: {"upstream_id":7178706266519722477,"namespace":"default","id":"simple-repl
 This section introduces the configuration of a replication task.

+### `memory-quota`
+
+- Specifies the memory quota (in bytes) that can be used in the capture server by the sink manager. If the value is exceeded, the overused part is recycled by the Go runtime.
+- Default value: `1073741824` (1 GiB)
+
+### `case-sensitive`
+
+- Specifies whether the database names and table names in the configuration file are case-sensitive. Starting from v6.5.6, v7.1.3, and v7.5.0, the default value changes from `true` to `false`.
+- This configuration item affects configurations related to filter and sink.
+- Default value: `false`
+
+### `force-replicate`
+
+- Specifies whether to forcibly [replicate tables without a valid index](/ticdc/ticdc-manage-changefeed.md#replicate-tables-without-a-valid-index).
+- Default value: `false`
+
+### `enable-sync-point` New in v6.3.0
+
+- Specifies whether to enable the Syncpoint feature, which is supported starting from v6.3.0 and is disabled by default.
+- Starting from v6.4.0, only the changefeed with the `SYSTEM_VARIABLES_ADMIN` or `SUPER` privilege can use the TiCDC Syncpoint feature.
+- This configuration item only takes effect if the downstream is TiDB.
+- Default value: `false`
+
+### `sync-point-interval`
+
+- Specifies the interval at which Syncpoint aligns the upstream and downstream snapshots.
+- This configuration item only takes effect if the downstream is TiDB.
+- The format is `"h m s"`. For example, `"1h30m30s"`.
+- Default value: `"10m"`
+- Minimum value: `"30s"`
+
+### `sync-point-retention`
+
+- Specifies how long the data is retained by Syncpoint in the downstream table. When this duration is exceeded, the data is cleaned up.
+- This configuration item only takes effect if the downstream is TiDB.
+- The format is `"h m s"`. For example, `"24h30m30s"`.
+- Default value: `"24h"`
+
+### `sql-mode` New in v6.5.6, v7.1.3, and v7.5.0
+
+- Specifies the [SQL mode](/sql-mode.md) used when parsing DDL statements. Multiple modes are separated by commas.
+- Default value: `"ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"`, which is the same as the default SQL mode of TiDB
+
+### `bdr-mode`
+
+- To set up BDR (bidirectional replication) clusters using TiCDC, modify this parameter to `true` and set the TiDB clusters to BDR mode. For more information, see [Bidirectional Replication](/ticdc/ticdc-bidirectional-replication.md#bidirectional-replication).
+- Default value: `false`, indicating that bidirectional replication (BDR) mode is not enabled
+
+### `changefeed-error-stuck-duration`
+
+- Specifies the duration for which the changefeed is allowed to automatically retry when internal errors or exceptions occur.
+- The changefeed enters the failed state if internal errors or exceptions occur in the changefeed and persist longer than the duration set by this parameter.
+- When the changefeed is in the failed state, you need to restart the changefeed manually for recovery.
+- The format is `"h m s"`. For example, `"1h30m30s"`.
+- Default value: `"30m"`
+
+### mounter
+
+#### `worker-num`
+
+- Specifies the number of threads with which the mounter decodes KV data.
+- Default value: `16`
+
+### filter
+
+#### `ignore-txn-start-ts`
+
+- Ignores transactions with the specified `start_ts` values.
+
+
+
+#### `rules`
+
+- Specifies the filter rules. For more information, see [Syntax](/table-filter.md#syntax).
+
+
+
+#### filter.event-filters
+
+For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filter-rules).
+
+##### `matcher`
+
+- `matcher` is an allow list. `matcher = ["test.worker"]` means this rule only applies to the `worker` table in the `test` database.
+
+##### `ignore-event`
+
+- `ignore-event = ["insert"]` ignores `INSERT` events.
+- `ignore-event = ["drop table", "delete"]` ignores the `DROP TABLE` DDL events and the `DELETE` DML events. Note that when a value in the clustered index column is updated in TiDB, TiCDC splits an `UPDATE` event into `DELETE` and `INSERT` events. TiCDC cannot identify such events as `UPDATE` events and thus cannot correctly filter out such events.
+
+##### `ignore-sql`
+
+- `ignore-sql = ["^drop", "add column"]` ignores DDLs that start with `DROP` or contain `ADD COLUMN`.
+
+##### `ignore-delete-value-expr`
+
+- `ignore-delete-value-expr = "name = 'john'"` ignores `DELETE` DMLs that contain the condition `name = 'john'`.
+
+##### `ignore-insert-value-expr`
+
+- `ignore-insert-value-expr = "id >= 100"` ignores `INSERT` DMLs that contain the condition `id >= 100`.
+
+##### `ignore-update-old-value-expr`
+
+- `ignore-update-old-value-expr = "age < 18"` ignores `UPDATE` DMLs whose old value contains `age < 18`.
+
+##### `ignore-update-new-value-expr`
+
+- `ignore-update-new-value-expr = "gender = 'male'"` ignores `UPDATE` DMLs whose new value contains `gender = 'male'`.
+
+### scheduler
+
+#### `enable-table-across-nodes`
+
+- Allocates tables to multiple TiCDC nodes for replication on a per-Region basis.
+- This configuration item only takes effect on Kafka changefeeds and is not supported on MySQL changefeeds.
+- When `enable-table-across-nodes` is enabled, there are two allocation modes:
+
+    1. Allocate tables based on the number of Regions, so that each TiCDC node handles roughly the same number of Regions. If the number of Regions for a table exceeds the value of [`region-threshold`](#region-threshold), the table will be allocated to multiple nodes for replication. The default value of `region-threshold` is `100000`.
+    2. Allocate tables based on the write traffic, so that each TiCDC node handles roughly the same number of modified rows. Only when the number of modified rows per minute in a table exceeds the value of [`write-key-threshold`](#write-key-threshold) will this allocation take effect.
+
+    You only need to configure one of the two modes. If both `region-threshold` and `write-key-threshold` are configured, TiCDC prioritizes the traffic allocation mode, namely `write-key-threshold`.
+
+- Set it to `true` to enable this feature.
+- Default value: `false`
+
+#### `region-threshold`
+
+- Specifies the Region count threshold above which a table is split across multiple nodes, as described in allocation mode 1 of [`enable-table-across-nodes`](#enable-table-across-nodes).
+- Default value: `100000`
+
+#### `write-key-threshold`
+
+- Default value: `0`, which means that the traffic allocation mode is not used by default
+
+### sink
+
+
+
+#### `dispatchers`
+
+- For the sink of MQ type, you can use dispatchers to configure the event dispatcher.
+- Starting from v6.1.0, TiDB supports two types of event dispatchers: partition and topic.
+- The matching syntax of `matcher` is the same as the filter rule syntax.
+- This configuration item only takes effect if the downstream is MQ.
+- When the downstream MQ is Pulsar, if the routing rule for `partition` is not specified as any of `ts`, `index-value`, `table`, or `default`, each Pulsar message will be routed using the string you set as the key. For example, if you specify the routing rule for a matcher as the string `code`, then all Pulsar messages that match that matcher will be routed with `code` as the key.
+
+#### `column-selectors` New in v7.5.0
+
+- Selects specific columns for replication. This only takes effect when the downstream is Kafka.
+
+#### `protocol`
+
+- Specifies the protocol format used for encoding messages.
+- This configuration item only takes effect if the downstream is Kafka, Pulsar, or a storage service.
+- When the downstream is Kafka, the protocol can be `canal-json`, `avro`, `debezium`, `open-protocol`, or `simple`.
+- When the downstream is Pulsar, the protocol can only be `canal-json`.
+- When the downstream is a storage service, the protocol can only be `canal-json` or `csv`.
+
+
+
+#### `delete-only-output-handle-key-columns` New in v7.2.0
+
+- Specifies the output of `DELETE` events. This parameter is valid only for the `canal-json` and `open-protocol` protocols.
+- This parameter is incompatible with `force-replicate`. If both this parameter and `force-replicate` are set to `true`, TiCDC reports an error when creating a changefeed.
+- The Avro protocol is not controlled by this parameter and always outputs only the primary key columns or unique index columns.
+- The CSV protocol is not controlled by this parameter and always outputs all columns.
+- Default value: `false`, which means outputting all columns
+- When you set it to `true`, only primary key columns or unique index columns are output.
+
+#### `schema-registry`
+
+- Specifies the schema registry URL.
+- This configuration item only takes effect if the downstream is MQ.
+
+
+
+#### `encoder-concurrency`
+
+- Specifies the number of encoder threads used when encoding data.
+- This configuration item only takes effect if the downstream is MQ.
+- Default value: `32`
+
+#### `enable-kafka-sink-v2`
+
+> **Warning:**
+>
+> This configuration is an experimental feature. It is not recommended to use it in production environments.
+
+- Specifies whether to enable kafka-sink-v2, which uses the kafka-go sink library.
+- This configuration item only takes effect if the downstream is MQ.
+- Default value: `false`
+
+#### `only-output-updated-columns` New in v7.1.0
+
+- Specifies whether to only output the updated columns.
+- This configuration item only applies to the MQ downstream using the `open-protocol` or `canal-json` protocol.
+- Default value: `false`
+
+
+
+#### `terminator`
+
+- This configuration item is only used when you replicate data to storage sinks and can be ignored when replicating data to MQ or MySQL sinks.
+- Specifies the row terminator, used for separating two data change events.
+- Default value: `""`, which means `\r\n` is used
+
+#### `date-separator`
+
+- Specifies the date separator type used in the file directory. For more information, see [Data change records](/ticdc/ticdc-sink-to-cloud-storage.md#data-change-records).
+- This configuration item only takes effect if the downstream is a storage service.
+- Default value: `day`, which means separating files by day
+- Value options: `none`, `year`, `month`, `day`
+
+#### `enable-partition-separator`
+
+- Controls whether to use partitions as the separation string.
+- This configuration item only takes effect if the downstream is a storage service.
+- Default value: `true`, which means that partitions in a table are stored in separate directories
+- Note that this configuration will be deprecated in future versions and will be forcibly set to `true`. It is recommended to keep this configuration at its default value to avoid potential data loss in downstream partitioned tables. For more information, see [Issue #11979](https://github.com/pingcap/tiflow/issues/11979). For usage examples, see [Data change records](/ticdc/ticdc-sink-to-cloud-storage.md#data-change-records).
+
+#### `debezium-disable-schema`
+
+- Controls whether to disable the output of schema information.
+- Default value: `false`, which means enabling the output of schema information
+- This parameter only takes effect when the sink type is MQ and the output protocol is Debezium.
+
+#### sink.csv New in v6.5.0
+
+Starting from v6.5.0, TiCDC supports saving data changes to storage services in CSV format. Ignore the following configurations if you replicate data to MQ or MySQL sinks.
+
+##### `delimiter`
+
+- Specifies the character used to separate fields in the CSV file. The value must be an ASCII character.
+- Default value: `,`
+
+##### `quote`
+
+- Specifies the quotation character used to surround fields in the CSV file. If the value is empty, no quotation is used.
+- Default value: `"`
+
+##### `null`
+
+- Specifies the character displayed when a CSV column is `NULL`.
+- Default value: `\N`
+
+##### `include-commit-ts`
+
+- Controls whether to include commit-ts in CSV rows.
+- Default value: `false`
+
+##### `binary-encoding-method`
+
+- Specifies the encoding method of binary data.
+- Default value: `base64`
+- Value options: `base64`, `hex`
+
+##### `output-handle-key`
+
+- Controls whether to output handle key information. This configuration parameter is for internal implementation only, so it is not recommended to set it.
+- Default value: `false`
+
+##### `output-old-value`
+
+- Controls whether to output the value before the row data changes.
+- When it is enabled (setting it to `true`), the `UPDATE` event will output two rows of data: the first row is a `DELETE` event that outputs the data before the change; the second row is an `INSERT` event that outputs the changed data.
+- When it is enabled, the `"is-update"` column will be added before the column with data changes. This added column is used to identify whether the data change of the current row comes from the `UPDATE` event or the original `INSERT` or `DELETE` event. If the data change of the current row comes from the `UPDATE` event, the value of the `"is-update"` column is `true`. Otherwise, it is `false`.
+- Default value: `false`
+
+Starting from v8.0.0, TiCDC supports the Simple message encoding protocol. The following are the configuration parameters for the Simple protocol. For more information about the protocol, see [TiCDC Simple Protocol](/ticdc/ticdc-simple-protocol.md).
+
+The following configuration parameters control the sending behavior of bootstrap messages.
+
+#### `send-bootstrap-interval-in-sec`
+
+- Controls the time interval for sending bootstrap messages, in seconds.
+- Default value: `120`, which means that a bootstrap message is sent every 120 seconds for each table
+- Unit: Seconds
+
+#### `send-bootstrap-in-msg-count`
+
+- Controls the interval, in number of messages, at which bootstrap messages are sent.
+- Default value: `10000`, which means that a bootstrap message is sent every 10000 row change messages for each table
+- If you want to disable the sending of bootstrap messages, set both [`send-bootstrap-interval-in-sec`](#send-bootstrap-interval-in-sec) and `send-bootstrap-in-msg-count` to `0`.
+
+#### `send-bootstrap-to-all-partition`
+
+- Controls whether to send bootstrap messages to all partitions.
+- Setting it to `false` means bootstrap messages are sent to only the first partition of the corresponding table topic.
+- Default value: `true`, which means that bootstrap messages are sent to all partitions of the corresponding table topic
+
+#### sink.kafka-config.codec-config
+
+##### `encoding-format`
+
+- Controls the encoding format of the Simple protocol messages. Currently, the Simple protocol supports the `json` and `avro` encoding formats.
+- Default value: `json`
+- Value options: `json`, `avro`
+
+#### sink.open
+
+##### `output-old-value`
+
+- Controls whether to output the value before the row data changes. When it is disabled, the `UPDATE` event does not output the "p" field.
+- Default value: `true`
+
+#### sink.debezium
+
+##### `output-old-value`
+
+- Controls whether to output the value before the row data changes. When it is disabled, the `UPDATE` event does not output the "before" field.
+- Default value: `true`
+
+### consistent
+
+Specifies the replication consistency configurations for a changefeed when using the redo log. For more information, see [Eventually consistent replication in disaster scenarios](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios).
+
+> **Note:**
+>
+> The consistency-related configuration items only take effect when the downstream is a database and the redo log feature is enabled.
+
+#### `level`
+
+- The data consistency level. `"none"` means that the redo log is disabled.
+- Default value: `"none"`
+- Value options: `"none"`, `"eventual"`
+
+#### `max-log-size`
+
+- The maximum redo log size.
+- Default value: `64`
+- Unit: MiB
+
+#### `flush-interval`
+
+- The flush interval for the redo log.
+- Default value: `2000`
+- Unit: milliseconds
+
+#### `storage`
+
+- The storage URI of the redo log.
+- Default value: `""`
+
+#### `use-file-backend`
+
+- Specifies whether to store the redo log in a local file.
+- Default value: `false`
+
+#### `encoding-worker-num`
+
+- The number of encoding and decoding workers in the redo module.
+- Default value: `16`
+
+#### `flush-worker-num`
+
+- The number of flushing workers in the redo module.
+- Default value: `8`
+
+#### `compression` New in v6.5.6, v7.1.3, v7.5.1, and v7.6.0
+
+- The compression behavior for redo log files.
+- Default value: `""`, which means no compression
+- Value options: `""`, `"lz4"`
+
+#### `flush-concurrency` New in v6.5.6, v7.1.3, v7.5.1, and v7.6.0
+
+- The concurrency for uploading a single redo file.
+- Default value: `1`, which means concurrency is disabled
+
+### integrity
+
+#### `integrity-check-level`
+
+- Controls whether to enable the checksum validation for single-row data.
+- Default value: `"none"`, which means the feature is disabled
+- Value options: `"none"`, `"correctness"`
+
+#### `corruption-handle-level`
+
+- Specifies the log level of the changefeed when the checksum validation for single-row data fails.
+- Default value: `"warn"`
+- Value options: `"warn"`, `"error"`
+
+### sink.kafka-config
+
+The following configuration items only take effect when the downstream is Kafka.
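+
+Before the individual parameters, here is a minimal sketch of this section in a changefeed configuration file, assuming a SASL OAUTHBEARER setup. The client ID, secret, and URL values are illustrative placeholders, not defaults:
+
+```toml
+[sink.kafka-config]
+# SASL mechanism to use; OAUTHBEARER is assumed for this sketch.
+sasl-mechanism = "OAUTHBEARER"
+# Required for OAUTHBEARER: client credentials and the token endpoint.
+sasl-oauth-client-id = "ticdc-client"                          # placeholder
+sasl-oauth-client-secret = "changeit"                          # placeholder
+sasl-oauth-token-url = "https://idp.example.com/oauth2/token"  # placeholder
+# Optional: the grant type defaults to "client_credentials".
+sasl-oauth-grant-type = "client_credentials"
+```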
+
+#### `sasl-mechanism`
+
+- Specifies the Kafka SASL authentication mechanism.
+- Default value: `""`, indicating that SASL authentication is not used
+
+
+
+#### `sasl-oauth-client-id`
+
+- Specifies the client-id in the Kafka SASL OAUTHBEARER authentication. This parameter is required when the OAUTHBEARER authentication is used.
+- Default value: `""`
+
+#### `sasl-oauth-client-secret`
+
+- Specifies the client-secret in the Kafka SASL OAUTHBEARER authentication. This parameter is required when the OAUTHBEARER authentication is used.
+- Default value: `""`
+
+#### `sasl-oauth-token-url`
+
+- Specifies the token-url in the Kafka SASL OAUTHBEARER authentication to obtain the token. This parameter is required when the OAUTHBEARER authentication is used.
+- Default value: `""`
+
+#### `sasl-oauth-scopes`
+
+- Specifies the scopes in the Kafka SASL OAUTHBEARER authentication. This parameter is optional when the OAUTHBEARER authentication is used.
+- Default value: `""`
+
+#### `sasl-oauth-grant-type`
+
+- Specifies the grant-type in the Kafka SASL OAUTHBEARER authentication. This parameter is optional when the OAUTHBEARER authentication is used.
+- Default value: `"client_credentials"`
+
+#### `sasl-oauth-audience`
+
+- Specifies the audience in the Kafka SASL OAUTHBEARER authentication. This parameter is optional when the OAUTHBEARER authentication is used.
+- Default value: `""`
+
+
+
+#### `output-raw-change-event`
+
+- Controls whether to output the original data change event. For more information, see [Control whether to split primary or unique key `UPDATE` events](/ticdc/ticdc-split-update-behavior.md#control-whether-to-split-primary-or-unique-key-update-events).
+- Default value: `false`
+
+### sink.kafka-config.glue-schema-registry-config
+
+The following configuration is only required when using Avro as the protocol together with AWS Glue Schema Registry:
+
 ```toml
 # Specifies the memory quota (in bytes) that can be used in the capture server by the sink manager.
 # If the value is exceeded, the overused part will be recycled by the go runtime.
diff --git a/tidb-cloud/recovery-group-overview.md b/tidb-cloud/recovery-group-overview.md
index ebcf5a0a2c0c2..eae24576664cd 100644
--- a/tidb-cloud/recovery-group-overview.md
+++ b/tidb-cloud/recovery-group-overview.md
@@ -25,7 +25,7 @@ A recovery group consists of a set of replicated databases that can be failed ov
 - Currently, only TiDB Cloud Dedicated clusters hosted on AWS support recovery groups.
 - Recovery groups are established between two clusters.
-- Bi-directional replication of a database is not supported with recovery groups.
+- Bidirectional replication of a database is not supported with recovery groups.

 > **Warning**
 >