Conversation

@ti-chi-bot
Member

This is an automated cherry-pick of #21928

Replaces all instances of 'bi-directional replication' with 'bidirectional replication' across documentation files for consistency. Updates related explanations, configuration references, and limitations to use the unified terminology.

First-time contributors' checklist

What is changed, added or deleted? (Required)

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v9.0 (TiDB 9.0 versions)
  • v8.5 (TiDB 8.5 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)

What is the related PR or file link(s)?

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch

@ti-chi-bot ti-chi-bot added the area/ticdc, do-not-merge/hold, lgtm, needs-1-more-lgtm, size/M, and type/cherry-pick-for-release-8.1 labels on Oct 23, 2025
@ti-chi-bot
Member Author

@hfxsd This PR has conflicts, so I have put it on hold.
Please resolve them or ask others to resolve them, then comment /unhold to remove the hold label.

@ti-chi-bot

ti-chi-bot bot commented Oct 23, 2025

@ti-chi-bot: If you want to know how to resolve it, please read the guide in the TiDB Dev Guide.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot

ti-chi-bot bot commented Oct 23, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign tangenta for approval. For more information, see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XXL label and removed the size/M label on Oct 23, 2025
@gemini-code-assist

Summary of Changes

Hello @ti-chi-bot, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the consistency and readability of the documentation by updating all instances of 'bi-directional replication' to 'bidirectional replication'. This change ensures a standardized terminology for the TiCDC replication feature across the documentation, making it easier for users to understand and follow the concepts.

Highlights

  • Terminology Standardization: The pull request standardizes the term 'bi-directional replication' to 'bidirectional replication' across various documentation files for improved consistency and clarity.
  • Documentation Updates: Updates were applied to release notes, system variable descriptions, TiCDC replication guides, and changefeed configuration documentation to reflect the unified terminology.
  • Automated Cherry-Pick: This change is an automated cherry-pick, indicating it's a backport of an already approved and merged change from another branch.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request aims to standardize the terminology by replacing 'bi-directional replication' with 'bidirectional replication' across several documentation files. The changes are generally good for consistency.

However, I've found critical issues in two files, ticdc/ticdc-bidirectional-replication.md and ticdc/ticdc-changefeed-config.md, which contain unresolved merge conflict markers. These must be fixed before this PR can be merged.

Additionally, I've pointed out a minor wording issue in ticdc/ticdc-bidirectional-replication.md that could cause confusion and suggested a correction for clarity.

Please address these points to finalize the pull request.

Comment on lines +157 to +161
<<<<<<< HEAD
- Usually do not use `AUTO_INCREMENT` or `AUTO_RANDOM` to avoid data conflicts in the replicated tables. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` for different clusters to ensure that different clusters can be assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bi-directional replication, you can set them as follows:
=======
- Usually do not use [`AUTO_INCREMENT`](/auto-increment.md) or [`AUTO_RANDOM`](/auto-random.md) to avoid data conflicts in the replicated tables. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` for different clusters to ensure that different clusters can be assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bidirectional replication, you can set them as follows:
>>>>>>> 1e9dac24d7 (ticdc: fix wrong names in bidirectional replication (#21928))

critical

This file contains merge conflict markers (<<<<<<< HEAD, =======, >>>>>>>). These need to be resolved before merging. It seems the intended change was to add links to AUTO_INCREMENT and AUTO_RANDOM and update the terminology. Please resolve the conflict.

Suggested change
<<<<<<< HEAD
- Usually do not use `AUTO_INCREMENT` or `AUTO_RANDOM` to avoid data conflicts in the replicated tables. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` for different clusters to ensure that different clusters can be assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bi-directional replication, you can set them as follows:
=======
- Usually do not use [`AUTO_INCREMENT`](/auto-increment.md) or [`AUTO_RANDOM`](/auto-random.md) to avoid data conflicts in the replicated tables. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` for different clusters to ensure that different clusters can be assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bidirectional replication, you can set them as follows:
>>>>>>> 1e9dac24d7 (ticdc: fix wrong names in bidirectional replication (#21928))
- Usually do not use [`AUTO_INCREMENT`](/auto-increment.md) or [`AUTO_RANDOM`](/auto-random.md) to avoid data conflicts in the replicated tables. If you need to use `AUTO_INCREMENT` or `AUTO_RANDOM`, you can set different `auto_increment_increment` and `auto_increment_offset` for different clusters to ensure that different clusters can be assigned different primary keys. For example, if there are three TiDB clusters (A, B, and C) in bidirectional replication, you can set them as follows:

Comment on lines +39 to +474

critical

This file contains merge conflict markers (<<<<<<< HEAD, =======, >>>>>>>). These need to be resolved before merging. It appears a large block of content was intended to be added here. Please resolve the conflict by removing the markers and keeping the intended content.

Suggested change

### `memory-quota`

- Specifies the memory quota (in bytes) that can be used in the capture server by the sink manager. If the value is exceeded, the overused part will be recycled by the go runtime.
- Default value: `1073741824` (1 GiB)

### `case-sensitive`

- Specifies whether the database names and tables in the configuration file are case-sensitive. Starting from v6.5.6, v7.1.3, and v7.5.0, the default value changes from `true` to `false`.
- This configuration item affects configurations related to filter and sink.
- Default value: `false`

### `force-replicate`

- Specifies whether to forcibly [replicate tables without a valid index](/ticdc/ticdc-manage-changefeed.md#replicate-tables-without-a-valid-index).
- Default value: `false`
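
Read together, the items above are top-level keys in the changefeed configuration file. A minimal sketch, with the documented default values shown for illustration only:

```toml
# Top-level items of a changefeed configuration file (TOML).
# Values below are the documented defaults, shown for illustration.
memory-quota = 1073741824  # sink manager memory quota in bytes (1 GiB)
case-sensitive = false     # database/table names in this file are case-insensitive
force-replicate = false    # do not force-replicate tables without a valid index
```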

### `enable-sync-point` <span class="version-mark">New in v6.3.0</span>

- Specifies whether to enable the Syncpoint feature, which is supported starting from v6.3.0 and is disabled by default.
- Starting from v6.4.0, only the changefeed with the `SYSTEM_VARIABLES_ADMIN` or `SUPER` privilege can use the TiCDC Syncpoint feature.
- This configuration item only takes effect if the downstream is TiDB.
- Default value: `false`

### `sync-point-interval`

- Specifies the interval at which Syncpoint aligns the upstream and downstream snapshots.
- This configuration item only takes effect if the downstream is TiDB.
- The format is `"h m s"`. For example, `"1h30m30s"`.
- Default value: `"10m"`
- Minimum value: `"30s"`

### `sync-point-retention`

- Specifies how long the data is retained by Syncpoint in the downstream table. When this duration is exceeded, the data is cleaned up.
- This configuration item only takes effect if the downstream is TiDB.
- The format is `"h m s"`. For example, `"24h30m30s"`.
- Default value: `"24h"`
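
As a sketch, the three Syncpoint items above can be combined as follows; the interval and retention values are illustrative, and the downstream must be TiDB:

```toml
# Enable Syncpoint and tune how often snapshots are aligned
# and how long Syncpoint data is retained downstream.
enable-sync-point = true
sync-point-interval = "10m"
sync-point-retention = "24h"
```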

### `sql-mode` <span class="version-mark">New in v6.5.6, v7.1.3, and v7.5.0</span>

- Specifies the [SQL mode](/sql-mode.md) used when parsing DDL statements. Multiple modes are separated by commas.
- Default value: `"ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"`, which is the same as the default SQL mode of TiDB

### `bdr-mode`

- To set up BDR (Bidirectional replication) clusters using TiCDC, modify this parameter to `true` and set the TiDB clusters to BDR mode. For more information, see [Bidirectional Replication](/ticdc/ticdc-bidirectional-replication.md#bidirectional-replication).
- Default value: `false`, indicating that bidirectional replication (BDR) mode is not enabled

### `changefeed-error-stuck-duration`

- Specifies the duration for which the changefeed is allowed to automatically retry when internal errors or exceptions occur.
- The changefeed enters the failed state if internal errors or exceptions occur in the changefeed and persist longer than the duration set by this parameter.
- When the changefeed is in the failed state, you need to restart the changefeed manually for recovery.
- The format is `"h m s"`. For example, `"1h30m30s"`.
- Default value: `"30m"`
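
A sketch of the remaining top-level items; the values are illustrative, and enabling `bdr-mode` additionally requires setting the TiDB clusters to BDR mode as described above:

```toml
# SQL mode used when parsing DDL statements (the documented default).
sql-mode = "ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
bdr-mode = true                          # this changefeed takes part in bidirectional replication
changefeed-error-stuck-duration = "30m"  # changefeed fails after retrying for 30 minutes
```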

### mounter

#### `worker-num`

- Specifies the number of threads with which the mounter decodes KV data.
- Default value: `16`

### filter

#### `ignore-txn-start-ts`

- Ignores the transaction of specified start_ts.

<!-- Example: `[1, 2]` -->

#### `rules`

- Specifies the filter rules. For more information, see [Syntax](/table-filter.md#syntax).

<!-- Example: `['*.*', '!test.*']` -->

#### filter.event-filters

For more information, see [Event filter rules](/ticdc/ticdc-filter.md#event-filter-rules).

##### `matcher`

- `matcher` is an allow list. `matcher = ["test.worker"]` means this rule only applies to the `worker` table in the `test` database.

##### `ignore-event`

- `ignore-event = ["insert"]` ignores `INSERT` events. 
- `ignore-event = ["drop table", "delete"]` ignores the `DROP TABLE` DDL events and the `DELETE` DML events. Note that when a value in the clustered index column is updated in TiDB, TiCDC splits an `UPDATE` event into `DELETE` and `INSERT` events. TiCDC cannot identify such events as `UPDATE` events and thus cannot correctly filter out such events.

##### `ignore-sql`

- `ignore-sql = ["^drop", "add column"]` ignores DDLs that start with `DROP` or contain `ADD COLUMN`.

##### `ignore-delete-value-expr`

- `ignore-delete-value-expr = "name = 'john'"` ignores `DELETE` DMLs that contain the condition `name = 'john'`.

##### `ignore-insert-value-expr`

- `ignore-insert-value-expr = "id >= 100"` ignores `INSERT` DMLs that contain the condition `id >= 100`.

##### `ignore-update-old-value-expr`

- `ignore-update-old-value-expr = "age < 18"` ignores `UPDATE` DMLs whose old value contains `age < 18`.

##### `ignore-update-new-value-expr`

- `ignore-update-new-value-expr = "gender = 'male'"` ignores `UPDATE` DMLs whose new value contains `gender = 'male'`.
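
Putting the filter items together, a sketch of a `[filter]` block with one event filter rule; the table names and expressions are placeholders:

```toml
[filter]
# Replicate all tables except those in the tmp database.
rules = ['*.*', '!tmp.*']

[[filter.event-filters]]
matcher = ["test.worker"]              # this rule applies only to test.worker
ignore-event = ["delete"]              # skip DELETE DML events
ignore-insert-value-expr = "id >= 100" # skip INSERT DMLs matching this condition
```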

### scheduler

#### `enable-table-across-nodes`

- Controls whether to allocate tables to multiple TiCDC nodes for replication on a per-Region basis.
- This configuration item only takes effect on Kafka changefeeds and is not supported on MySQL changefeeds.
- When `enable-table-across-nodes` is enabled, there are two allocation modes:

    1. Allocate tables based on the number of Regions, so that each TiCDC node handles roughly the same number of Regions. If the number of Regions for a table exceeds the value of [`region-threshold`](#region-threshold), the table will be allocated to multiple nodes for replication. The default value of `region-threshold` is `100000`.
    2. Allocate tables based on the write traffic, so that each TiCDC node handles roughly the same number of modified rows. Only when the number of modified rows per minute in a table exceeds the value of [`write-key-threshold`](#write-key-threshold), will this allocation take effect.

  You only need to configure one of the two modes. If both `region-threshold` and `write-key-threshold` are configured, TiCDC prioritizes the traffic allocation mode, namely `write-key-threshold`.

- Default value: `false`. Set it to `true` to enable this feature.

#### `region-threshold`

- Default value: `100000`

#### `write-key-threshold`

- Default value: `0`, which means that the traffic allocation mode is not used by default
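
For example, a sketch that turns on Region-based scheduling for a Kafka changefeed; the threshold shown is the documented default:

```toml
[scheduler]
enable-table-across-nodes = true
region-threshold = 100000  # tables above this Region count are split across nodes
```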

### sink

<!-- MQ sink configuration items -->

#### `dispatchers`

- For the sink of MQ type, you can use dispatchers to configure the event dispatcher.
- Starting from v6.1.0, TiDB supports two types of event dispatchers: partition and topic.
- The matching syntax of matcher is the same as the filter rule syntax.
- This configuration item only takes effect if the downstream is MQ.
- When the downstream MQ is Pulsar, if the routing rule for `partition` is not specified as any of `ts`, `index-value`, `table`, or `default`, each Pulsar message will be routed using the string you set as the key. For example, if you specify the routing rule for a matcher as the string `code`, then all Pulsar messages that match that matcher will be routed with `code` as the key.
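
A sketch of a `[sink]` block with dispatchers for a Kafka downstream; the matchers and the topic expression are placeholders:

```toml
[sink]
protocol = "canal-json"
dispatchers = [
    { matcher = ['test1.*', 'test2.*'], topic = "tidb_{schema}_{table}", partition = "index-value" },
    { matcher = ['test6.*'], partition = "ts" },
]
```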

#### `column-selectors` <span class="version-mark">New in v7.5.0</span>

- Selects specific columns for replication. This only takes effect when the downstream is Kafka.

#### `protocol`

- Specifies the protocol format used for encoding messages.
- This configuration item only takes effect if the downstream is Kafka, Pulsar, or a storage service.
- When the downstream is Kafka, the protocol can be canal-json, avro, debezium, open-protocol, or simple.
- When the downstream is Pulsar, the protocol can only be canal-json.
- When the downstream is a storage service, the protocol can only be canal-json or csv.

<!-- Example: `"canal-json"` -->

#### `delete-only-output-handle-key-columns` <span class="version-mark">New in v7.2.0</span>

- Specifies the output of DELETE events. This parameter is valid only for canal-json and open-protocol protocols.
- This parameter is incompatible with `force-replicate`. If both this parameter and `force-replicate` are set to `true`, TiCDC reports an error when creating a changefeed.
- The Avro protocol is not controlled by this parameter and always outputs only the primary key columns or unique index columns.
- The CSV protocol is not controlled by this parameter and always outputs all columns.
- Default value: `false`, which means outputting all columns
- When you set it to `true`, only primary key columns or unique index columns are output.

#### `schema-registry`

- Specifies the schema registry URL.
- This configuration item only takes effect if the downstream is MQ.

<!-- Example: `"http://localhost:80801/subjects/{subject-name}/versions/{version-number}/schema"` -->

#### `encoder-concurrency`

- Specifies the number of encoder threads used when encoding data.
- This configuration item only takes effect if the downstream is MQ.
- Default value: `32`

#### `enable-kafka-sink-v2`

> **Warning:**
>
> This configuration is an experimental feature. It is not recommended to use it in production environments.

- Specifies whether to enable kafka-sink-v2 that uses the kafka-go sink library.
- This configuration item only takes effect if the downstream is MQ.
- Default value: `false`

#### `only-output-updated-columns` <span class="version-mark">New in v7.1.0</span>

- Specifies whether to only output the updated columns.
- This configuration item only applies to the MQ downstream using the open-protocol and canal-json.
- Default value: `false`

<!-- Storage sink configuration items -->

#### `terminator`

- This configuration item is only used when you replicate data to storage sinks and can be ignored when replicating data to MQ or MySQL sinks.
- Specifies the row terminator, used for separating two data change events.
- Default value: `""`, which means `\r\n` is used

#### `date-separator`

- Specifies the date separator type used in the file directory. For more information, see [Data change records](/ticdc/ticdc-sink-to-cloud-storage.md#data-change-records).
- This configuration item only takes effect if the downstream is a storage service.
- Default value: `day`, which means separating files by day
- Value options: `none`, `year`, `month`, `day`

#### `enable-partition-separator`

- Controls whether to use partitions as the separation string.
- This configuration item only takes effect if the downstream is a storage service.
- Default value: `true`, which means that partitions in a table are stored in separate directories
- Note that this configuration will be deprecated in future versions and will be forcibly set to `true`. It is recommended to keep this configuration at its default value to avoid potential data loss in downstream partitioned tables. For more information, see [Issue #11979](https://github.com/pingcap/tiflow/issues/11979). For usage examples, see [Data change records](/ticdc/ticdc-sink-to-cloud-storage.md#data-change-records).

#### `debezium-disable-schema`

- Controls whether to disable the output of schema information.
- Default value: `false`, which means enabling the output of schema information
- This parameter only takes effect when the sink type is MQ and the output protocol is Debezium.

#### sink.csv <span class="version-mark">New in v6.5.0</span>

Starting from v6.5.0, TiCDC supports saving data changes to storage services in CSV format. Ignore the following configurations if you replicate data to MQ or MySQL sinks.

##### `delimiter`

- Specifies the character used to separate fields in the CSV file. The value must be an ASCII character.
- Default value: `,`

##### `quote`

- Specifies the quotation character used to surround fields in the CSV file. If the value is empty, no quotation is used.
- Default value: `"`

##### `null`

- Specifies the character displayed when a CSV column is NULL.
- Default value: `\N`

##### `include-commit-ts`

- Controls whether to include commit-ts in CSV rows.
- Default value: `false`

##### `binary-encoding-method`

- Specifies the encoding method of binary data.
- Default value: `base64`
- Value option: `base64`, `hex`

##### `output-handle-key`

- Controls whether to output handle key information. This configuration parameter is for internal implementation only, so it is not recommended to set it.
- Default value: `false`

##### `output-old-value`

- Controls whether to output the value before the row data changes.
- When it is enabled (setting it to `true`), the `UPDATE` event will output two rows of data: the first row is a `DELETE` event that outputs the data before the change; the second row is an `INSERT` event that outputs the changed data.
- When it is enabled, the `"is-update"` column will be added before the column with data changes. This added column is used to identify whether the data change of the current row comes from the `UPDATE` event or the original `INSERT` or `DELETE` event. If the data change of the current row comes from the `UPDATE` event, the value of the `"is-update"` column is `true`. Otherwise, it is `false`.
- Default value: `false`
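
As a combined sketch for a storage-service downstream, the storage and `[sink.csv]` items above might be set as follows (values illustrative):

```toml
[sink]
protocol = "csv"
terminator = "\n"       # row terminator between two change events
date-separator = "day"  # one directory per day

[sink.csv]
delimiter = ","
quote = '"'
null = '\N'
include-commit-ts = true
binary-encoding-method = "base64"
```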

Starting from v8.0.0, TiCDC supports the Simple message encoding protocol. The following are the configuration parameters for the Simple protocol. For more information about the protocol, see [TiCDC Simple Protocol](/ticdc/ticdc-simple-protocol.md).

The following configuration parameters control the sending behavior of bootstrap messages.

#### `send-bootstrap-interval-in-sec`

- Controls the time interval for sending bootstrap messages, in seconds.
- Default value: `120`, which means that a bootstrap message is sent every 120 seconds for each table
- Unit: Seconds

#### `send-bootstrap-in-msg-count`

- Controls the message interval for sending bootstrap, in message count.
- Default value: `10000`, which means that a bootstrap message is sent every 10000 row changed messages for each table
- If you want to disable the sending of bootstrap messages, set both [`send-bootstrap-interval-in-sec`](#send-bootstrap-interval-in-sec) and `send-bootstrap-in-msg-count` to `0`.

#### `send-bootstrap-to-all-partition`

- Controls whether to send bootstrap messages to all partitions.
- Setting it to `false` means bootstrap messages are sent to only the first partition of the corresponding table topic.
- Default value: `true`, which means that bootstrap messages are sent to all partitions of the corresponding table topic
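
A sketch of the bootstrap-message items for the Simple protocol, using the documented defaults; setting the first two to `0` disables bootstrap messages:

```toml
[sink]
send-bootstrap-interval-in-sec = 120    # one bootstrap message per table every 120 seconds
send-bootstrap-in-msg-count = 10000     # or after every 10000 row changed messages
send-bootstrap-to-all-partition = true  # send to all partitions of the table topic
```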

#### sink.kafka-config.codec-config

##### `encoding-format`

- Controls the encoding format of the Simple protocol messages. Currently, the Simple protocol message supports `json` and `avro` encoding formats.
- Default value: `json`
- Value options: `json`, `avro`

#### sink.open

##### `output-old-value`

- Controls whether to output the value before the row data changes. When it is disabled, the `UPDATE` event does not output the "p" field.
- Default value: `true`

#### sink.debezium

##### `output-old-value`

- Controls whether to output the value before the row data changes. When it is disabled, the `UPDATE` event does not output the "before" field.
- Default value: `true`

### consistent

Specifies the replication consistency configurations for a changefeed when using the redo log. For more information, see [Eventually consistent replication in disaster scenarios](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios).

Note: The consistency-related configuration items only take effect when the downstream is a database and the redo log feature is enabled.

#### `level`

- The data consistency level. `"none"` means that the redo log is disabled.
- Default value: `"none"`
- Value options: `"none"`, `"eventual"`

#### `max-log-size`

- The max redo log size.
- Default value: `64`
- Unit: MiB

#### `flush-interval`

- The flush interval for redo log.
- Default value: `2000`
- Unit: milliseconds

#### `storage`

- The storage URI of the redo log.
- Default value: `""`

#### `use-file-backend`

- Specifies whether to store the redo log in a local file.
- Default value: `false`

#### `encoding-worker-num`

- The number of encoding and decoding workers in the redo module.
- Default value: `16`

#### `flush-worker-num`

- The number of flushing workers in the redo module.
- Default value: `8`

#### `compression` <span class="version-mark">New in v6.5.6, v7.1.3, v7.5.1, and v7.6.0</span>

- The behavior to compress redo log files.
- Default value: `""`, which means no compression
- Value options: `""`, `"lz4"`

#### `flush-concurrency` <span class="version-mark">New in v6.5.6, v7.1.3, v7.5.1, and v7.6.0</span>

- The concurrency for uploading a single redo file.
- Default value: `1`, which means concurrency is disabled
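
Combining the `[consistent]` items, a sketch that enables the redo log; the storage URI is a placeholder:

```toml
[consistent]
level = "eventual"                   # enable the redo log
max-log-size = 64                    # MiB
flush-interval = 2000                # milliseconds
storage = "s3://redo-bucket/prefix"  # placeholder storage URI
compression = "lz4"                  # compress redo log files
```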

### integrity

#### `integrity-check-level`

- Controls whether to enable the checksum validation for single-row data.
- Default value: `"none"`, which means to disable the feature
- Value options: `"none"`, `"correctness"`

#### `corruption-handle-level`

- Specifies the log level of the changefeed when the checksum validation for single-row data fails.
- Default value: `"warn"` 
- Value options: `"warn"`, `"error"`
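
A sketch that enables single-row checksum validation and logs a warning when validation fails:

```toml
[integrity]
integrity-check-level = "correctness"
corruption-handle-level = "warn"
```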

### sink.kafka-config

The following configuration items only take effect when the downstream is Kafka.

#### `sasl-mechanism`

- Specifies the mechanism of Kafka SASL authentication.
- Default value: `""`, indicating that SASL authentication is not used

<!-- Example: `OAUTHBEARER` -->

#### `sasl-oauth-client-id`

- Specifies the client-id in the Kafka SASL OAUTHBEARER authentication. This parameter is required when the OAUTHBEARER authentication is used.
- Default value: `""`

#### `sasl-oauth-client-secret`

- Specifies the client-secret in the Kafka SASL OAUTHBEARER authentication. This parameter is required when the OAUTHBEARER authentication is used.
- Default value: `""`

#### `sasl-oauth-token-url`

- Specifies the token-url in the Kafka SASL OAUTHBEARER authentication to obtain the token. This parameter is required when the OAUTHBEARER authentication is used.
- Default value: `""`

#### `sasl-oauth-scopes`

- Specifies the scopes in the Kafka SASL OAUTHBEARER authentication. This parameter is optional when the OAUTHBEARER authentication is used.
- Default value: `""`

#### `sasl-oauth-grant-type`

- Specifies the grant-type in the Kafka SASL OAUTHBEARER authentication. This parameter is optional when the OAUTHBEARER authentication is used.
- Default value: `"client_credentials"`

#### `sasl-oauth-audience`

- Specifies the audience in the Kafka SASL OAUTHBEARER authentication. This parameter is optional when the OAUTHBEARER authentication is used.
- Default value: `""`

<!-- Example: `"kafka"` -->
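
The SASL items above might be assembled as in this sketch; every credential and URL below is a placeholder, not a working endpoint:

```toml
[sink.kafka-config]
sasl-mechanism = "OAUTHBEARER"
sasl-oauth-client-id = "example-client-id"                      # placeholder
sasl-oauth-client-secret = "example-client-secret"              # placeholder
sasl-oauth-token-url = "https://auth.example.com/oauth2/token"  # placeholder
sasl-oauth-grant-type = "client_credentials"
```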

#### `output-raw-change-event`

- Controls whether to output the original data change event. For more information, see [Control whether to split primary or unique key `UPDATE` events](/ticdc/ticdc-split-update-behavior.md#control-whether-to-split-primary-or-unique-key-update-events).
- Default value: `false`

### sink.kafka-config.glue-schema-registry-config

The following configuration is only required when using Avro as the protocol and AWS Glue Schema Registry:

1. Check the time point of the upstream and downstream clusters. In the case of two TiDB clusters, make sure that data in the two clusters are consistent at certain time points. For example, the data of TiDB 1 at `ts=1` and the data of TiDB 2 at `ts=2` are consistent.

2. When you create the changefeed, set the `--start-ts` of the changefeed for the upstream cluster to the corresponding `tso`. That is, if the upstream cluster is TiDB A, set `--start-ts=1`; if the upstream cluster is TiDB B, set `--start-ts=2`.
2. When you create the changefeed, set the `--start-ts` of the changefeed for the upstream cluster to the corresponding `tso`. That is, if the upstream cluster is TiDB 1, set `--start-ts=1`; if the downstream cluster is TiDB 2, set `--start-ts=2`.

medium

The description for setting --start-ts for the second cluster is a bit confusing. It says if the downstream cluster is TiDB 2, set --start-ts=2. For clarity and consistency, it would be better to refer to it as the upstream of the second changefeed. I suggest changing it to if the upstream cluster is TiDB 2, set --start-ts=2.

Suggested change
2. When you create the changefeed, set the `--start-ts` of the changefeed for the upstream cluster to the corresponding `tso`. That is, if the upstream cluster is TiDB 1, set `--start-ts=1`; if the downstream cluster is TiDB 2, set `--start-ts=2`.
2. When you create the changefeed, set the `--start-ts` of the changefeed for the upstream cluster to the corresponding `tso`. That is, if the upstream cluster is TiDB 1, set `--start-ts=1`; if the upstream cluster is TiDB 2, set `--start-ts=2`.

@ti-chi-bot

ti-chi-bot bot commented Oct 23, 2025

@ti-chi-bot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-verify | 6d8d778 | link | true | /test pull-verify |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hfxsd hfxsd closed this Oct 23, 2025