Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update the resolved description for clarity on duration between timestamps #19357

Merged
merged 5 commits into from
Feb 11, 2025

Conversation

kathancox
Copy link
Contributor

Fixes DOC-12082

@kathancox kathancox force-pushed the resolved-option-description branch from 075d0ee to 76e4357 Compare February 7, 2025 15:06
Copy link

netlify bot commented Feb 7, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit 075d0ee
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-interactivetutorials-docs/deploys/67a62176f0541000085ebac3

Copy link

netlify bot commented Feb 7, 2025

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit 075d0ee
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-api-docs/deploys/67a6217691316d0009aa7544

Copy link

netlify bot commented Feb 7, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit 2fa6547
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-interactivetutorials-docs/deploys/67abb58e83a21b00084197be

Copy link

netlify bot commented Feb 7, 2025

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit 2fa6547
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-api-docs/deploys/67abb58e77c7920008cb72c5

Copy link

netlify bot commented Feb 7, 2025

Netlify Preview

Name Link
🔨 Latest commit 075d0ee
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-docs/deploys/67a62176a131c00008542d6b
😎 Deploy Preview https://deploy-preview-19357--cockroachdb-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

netlify bot commented Feb 7, 2025

Netlify Preview

Name Link
🔨 Latest commit 2fa6547
🔍 Latest deploy log https://app.netlify.com/sites/cockroachdb-docs/deploys/67abb58ea9849e0008274852
😎 Deploy Preview https://deploy-preview-19357--cockroachdb-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@kathancox kathancox requested a review from andyyang890 February 7, 2025 21:38
Copy link

@andyyang890 andyyang890 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting up this change! Left a few comments.

src/current/v25.1/create-changefeed.md Outdated Show resolved Hide resolved
src/current/v25.1/changefeed-messages.md Outdated Show resolved Hide resolved
src/current/v25.1/changefeed-messages.md Outdated Show resolved Hide resolved
src/current/v25.1/changefeed-for.md Outdated Show resolved Hide resolved
@@ -71,7 +71,7 @@ Option | Value | Description
<a name="initial-scan"></a>`initial_scan` / `no_initial_scan` / `initial_scan_only` | N/A | Control whether or not an initial scan will occur at the start time of a changefeed. `initial_scan_only` will perform an initial scan and then the changefeed job will complete with a `successful` status. You cannot use [`end_time`](#end-time) and `initial_scan_only` simultaneously.<br><br>If none of these options are specified, an initial scan will occur if there is no [`cursor`](#cursor-option), and will not occur if there is one. This preserves the behavior from previous releases. <br><br>You cannot specify `initial_scan` and `no_initial_scan` or `no_initial_scan and` `initial_scan_only` simultaneously.<br><br>Default: `initial_scan` <br>If used in conjunction with `cursor`, an initial scan will be performed at the cursor timestamp. If no `cursor` is specified, the initial scan is performed at `now()`.
<a name="min-checkpoint-frequency"></a>`min_checkpoint_frequency` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Controls how often nodes flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). Changefeeds will wait for at least the specified duration before a flushing. This can help you control the flush frequency to achieve better throughput. If this is set to `0s`, a node will flush as long as the high-water mark has increased for the ranges that particular node is processing. If a changefeed is resumed, then `min_checkpoint_frequency` is the amount of time that changefeed will need to catch up. That is, it could emit duplicate messages during this time. <br><br>**Note:** [`resolved`](#resolved-option) messages will not be emitted more frequently than the configured `min_checkpoint_frequency` (but may be emitted less frequently). Since `min_checkpoint_frequency` defaults to `30s`, you **must** configure `min_checkpoint_frequency` to at least the desired `resolved` message frequency if you require `resolved` messages more frequently than `30s`.<br><br>**Default:** `30s`
`mvcc_timestamp` | N/A | Include the [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp for each emitted row in a changefeed. With the `mvcc_timestamp` option, each emitted row will always contain its MVCC timestamp, even during the changefeed's initial backfill.
<a name="resolved-option"></a>`resolved` | [`INTERVAL`]({% link {{ page.version.version }}/interval.md %}) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events for the changefeed. Resolved timestamp events do not emit until all ranges in the changefeed have progressed to a specific point in time. <br><br>Set an optional minimal duration between emitting resolved timestamps. Example: `resolved='10s'`. This option will emit a resolved timestamp event **only** if the timestamp has advanced and at least the optional duration has elapsed. If no duration is specified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set the [`min_checkpoint_frequency`](#min-checkpoint-frequency) option to at minimum the same value as `resolved`, because `resolved` messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
<a name="resolved-option"></a>`resolved` | [`INTERVAL`]({% link {{ page.version.version }}/interval.md %}) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events for the changefeed. Resolved timestamp events do not emit until all ranges in the changefeed have progressed to a specific point in time. <br><br>Set a minimum amount of time between the last emitted resolved timestamp and the new resolved timestamp. Example: `resolved='10s'`. This option will emit a resolved timestamp event **only** if the timestamp has advanced and at least the optional duration has elapsed. If no duration is specified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set the [`min_checkpoint_frequency`](#min-checkpoint-frequency) option to at minimum the same value as `resolved`, because resolved messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also be [Duration string](https://pkg.go.dev/time#ParseDuration) like in create-changefeed.md? Also, is there any reason this text isn't identical to the one in that file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to duration string, which was missed on the EXP CF FOR page during some earlier update to the resolved option description.

Please let me know if this should be adjusted, but the main two points that I see are different are: "connected to a sink" and referring to the changefeed as a job. Does EXPERIMENTAL CHANGEFEED FOR operate as a job, I thought not, or is that incorrect?

Copy link

@andyyang890 andyyang890 Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, I was thinking the sentence Resolved timestamps do not emit until all ranges in the changefeed have progressed to a specific point in time. could be used for both types of changefeeds but I'm ok with being explicit about enterprise changefeeds being jobs-based.

Maybe instead of

Resolved timestamps do not emit until the changefeed job has reached a checkpoint.

we should say

Resolved timestamps do not emit until all ranges in the changefeed have progressed to a specific point in time and the progress has been checkpointed.

Edit: edited suggestion

Copy link

@andyyang890 andyyang890 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo one comment

@@ -71,7 +71,7 @@ Option | Value | Description
<a name="initial-scan"></a>`initial_scan` / `no_initial_scan` / `initial_scan_only` | N/A | Control whether or not an initial scan will occur at the start time of a changefeed. `initial_scan_only` will perform an initial scan and then the changefeed job will complete with a `successful` status. You cannot use [`end_time`](#end-time) and `initial_scan_only` simultaneously.<br><br>If none of these options are specified, an initial scan will occur if there is no [`cursor`](#cursor-option), and will not occur if there is one. This preserves the behavior from previous releases. <br><br>You cannot specify `initial_scan` and `no_initial_scan` or `no_initial_scan and` `initial_scan_only` simultaneously.<br><br>Default: `initial_scan` <br>If used in conjunction with `cursor`, an initial scan will be performed at the cursor timestamp. If no `cursor` is specified, the initial scan is performed at `now()`.
<a name="min-checkpoint-frequency"></a>`min_checkpoint_frequency` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Controls how often nodes flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). Changefeeds will wait for at least the specified duration before a flushing. This can help you control the flush frequency to achieve better throughput. If this is set to `0s`, a node will flush as long as the high-water mark has increased for the ranges that particular node is processing. If a changefeed is resumed, then `min_checkpoint_frequency` is the amount of time that changefeed will need to catch up. That is, it could emit duplicate messages during this time. <br><br>**Note:** [`resolved`](#resolved-option) messages will not be emitted more frequently than the configured `min_checkpoint_frequency` (but may be emitted less frequently). Since `min_checkpoint_frequency` defaults to `30s`, you **must** configure `min_checkpoint_frequency` to at least the desired `resolved` message frequency if you require `resolved` messages more frequently than `30s`.<br><br>**Default:** `30s`
`mvcc_timestamp` | N/A | Include the [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp for each emitted row in a changefeed. With the `mvcc_timestamp` option, each emitted row will always contain its MVCC timestamp, even during the changefeed's initial backfill.
<a name="resolved-option"></a>`resolved` | [`INTERVAL`]({% link {{ page.version.version }}/interval.md %}) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events for the changefeed. Resolved timestamp events do not emit until all ranges in the changefeed have progressed to a specific point in time. <br><br>Set an optional minimal duration between emitting resolved timestamps. Example: `resolved='10s'`. This option will emit a resolved timestamp event **only** if the timestamp has advanced and at least the optional duration has elapsed. If no duration is specified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set the [`min_checkpoint_frequency`](#min-checkpoint-frequency) option to at minimum the same value as `resolved`, because `resolved` messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
<a name="resolved-option"></a>`resolved` | [`INTERVAL`]({% link {{ page.version.version }}/interval.md %}) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events for the changefeed. Resolved timestamp events do not emit until all ranges in the changefeed have progressed to a specific point in time. <br><br>Set a minimum amount of time between the last emitted resolved timestamp and the new resolved timestamp. Example: `resolved='10s'`. This option will emit a resolved timestamp event **only** if the timestamp has advanced and at least the optional duration has elapsed. If no duration is specified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set the [`min_checkpoint_frequency`](#min-checkpoint-frequency) option to at minimum the same value as `resolved`, because resolved messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
Copy link

@andyyang890 andyyang890 Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, I was thinking the sentence Resolved timestamps do not emit until all ranges in the changefeed have progressed to a specific point in time. could be used for both types of changefeeds but I'm ok with being explicit about enterprise changefeeds being jobs-based.

Maybe instead of

Resolved timestamps do not emit until the changefeed job has reached a checkpoint.

we should say

Resolved timestamps do not emit until all ranges in the changefeed have progressed to a specific point in time and the progress has been checkpointed.

Edit: edited suggestion

Copy link
Contributor

@rmloveland rmloveland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -71,7 +71,7 @@ Option | Value | Description
<a name="initial-scan"></a>`initial_scan` / `no_initial_scan` / `initial_scan_only` | N/A | Control whether or not an initial scan will occur at the start time of a changefeed. `initial_scan_only` will perform an initial scan and then the changefeed job will complete with a `successful` status. You cannot use [`end_time`](#end-time) and `initial_scan_only` simultaneously.<br><br>If none of these options are specified, an initial scan will occur if there is no [`cursor`](#cursor-option), and will not occur if there is one. This preserves the behavior from previous releases. <br><br>You cannot specify `initial_scan` and `no_initial_scan` or `no_initial_scan and` `initial_scan_only` simultaneously.<br><br>Default: `initial_scan` <br>If used in conjunction with `cursor`, an initial scan will be performed at the cursor timestamp. If no `cursor` is specified, the initial scan is performed at `now()`.
<a name="min-checkpoint-frequency"></a>`min_checkpoint_frequency` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Controls how often nodes flush their progress to the [coordinating changefeed node]({% link {{ page.version.version }}/how-does-an-enterprise-changefeed-work.md %}). Changefeeds will wait for at least the specified duration before a flushing. This can help you control the flush frequency to achieve better throughput. If this is set to `0s`, a node will flush as long as the high-water mark has increased for the ranges that particular node is processing. If a changefeed is resumed, then `min_checkpoint_frequency` is the amount of time that changefeed will need to catch up. That is, it could emit duplicate messages during this time. <br><br>**Note:** [`resolved`](#resolved-option) messages will not be emitted more frequently than the configured `min_checkpoint_frequency` (but may be emitted less frequently). Since `min_checkpoint_frequency` defaults to `30s`, you **must** configure `min_checkpoint_frequency` to at least the desired `resolved` message frequency if you require `resolved` messages more frequently than `30s`.<br><br>**Default:** `30s`
`mvcc_timestamp` | N/A | Include the [MVCC]({% link {{ page.version.version }}/architecture/storage-layer.md %}#mvcc) timestamp for each emitted row in a changefeed. With the `mvcc_timestamp` option, each emitted row will always contain its MVCC timestamp, even during the changefeed's initial backfill.
<a name="resolved-option"></a>`resolved` | [`INTERVAL`]({% link {{ page.version.version }}/interval.md %}) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events for the changefeed. Resolved timestamp events do not emit until all ranges in the changefeed have progressed to a specific point in time. <br><br>Set an optional minimal duration between emitting resolved timestamps. Example: `resolved='10s'`. This option will emit a resolved timestamp event **only** if the timestamp has advanced and at least the optional duration has elapsed. If no duration is specified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set the [`min_checkpoint_frequency`](#min-checkpoint-frequency) option to at minimum the same value as `resolved`, because `resolved` messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
<a name="resolved-option"></a>`resolved` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Emit [resolved timestamps]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for the changefeed. Resolved timestamps do not emit until all ranges in the changefeed have progressed to a specific point in time.<br><br>Set a minimum amount of time that the changefeed's high-water mark (overall resolved timestamp) must advance by before another resolved timestamp is emitted. Example: `resolved='10s'`. This option will **only** emit a resolved timestamp if the timestamp has advanced (and by at least the optional duration, if set). If a duration is unspecified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set the [`min_checkpoint_frequency`](#min-checkpoint-frequency) option to at minimum the same value as `resolved`, because `resolved` messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see that we are changing the 'value' column from the SQL INTERVAL type which is meaningful from a user-facing perspective, to some Go language function docs; this leaks the detail of the implementation language (Go) to the user. we have done this in other places so there is precedent, and i assume it's intentional here, but wanted to drop a (non-blocking) comment. in a perfect world we would not have these but i understand why we do :-)

@@ -175,7 +175,7 @@ Option | Value | Description
<a name="on-error"></a>`on_error` | `pause` / `fail` | Use `on_error=pause` to pause the changefeed when encountering **non**-retryable errors. `on_error=pause` will pause the changefeed instead of sending it into a terminal failure state. **Note:** Retryable errors will continue to be retried with this option specified. <br><br>Use with [`protect_data_from_gc_on_pause`](#protect-pause) to protect changes from [garbage collection]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds).<br><br>If a changefeed with `on_error=pause` is running when a watched table is [truncated]({% link {{ page.version.version }}/truncate.md %}), the changefeed will pause but will not be able to resume reads from that table. Using [`ALTER CHANGEFEED`]({% link {{ page.version.version }}/alter-changefeed.md %}) to drop the table from the changefeed and then [resuming the job]({% link {{ page.version.version }}/resume-job.md %}) will work, but you cannot add the same table to the changefeed again. Instead, you will need to [create a new changefeed](#start-a-new-changefeed-where-another-ended) for that table.<br><br>Default: `on_error=fail`
<a name="protect-pause"></a>`protect_data_from_gc_on_pause` | N/A | When a [changefeed is paused]({% link {{ page.version.version }}/pause-job.md %}), ensure that the data needed to [resume the changefeed]({% link {{ page.version.version }}/resume-job.md %}) is not garbage collected. If `protect_data_from_gc_on_pause` is **unset**, pausing the changefeed will release the existing protected timestamp records. It is also important to note that pausing and adding `protect_data_from_gc_on_pause` to a changefeed will not protect data if the [garbage collection]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds) window has already passed. <br><br>Use with [`on_error=pause`](#on-error) to protect changes from garbage collection when encountering non-retryable errors. <br><br>See [Garbage collection and changefeeds]({% link {{ page.version.version }}/changefeed-messages.md %}#garbage-collection-and-changefeeds) for more detail on protecting changefeed data.<br><br>**Note:** If you use this option, changefeeds that are left paused for long periods of time can prevent garbage collection. Use with the [`gc_protect_expires_after`](#gc-protect-expire) option to set a limit for protected data and for how long a changefeed will remain paused.
<span class="version-tag">New in v23.1:</span> `pubsub_sink_config` | [`STRING`]({% link {{ page.version.version }}/string.md %}) | Set fields to configure sink batching and retries. The schema is as follows:<br><br> `{ "Flush": { "Messages": ..., "Bytes": ..., "Frequency": ..., }, "Retry": {"Max": ..., "Backoff": ..., } }`. <br><br>**Note** that if either `Messages` or `Bytes` are nonzero, then a non-zero value for `Frequency` must be provided. <br><br>Refer to [Pub/Sub sink configuration]({% link {{ page.version.version }}/changefeed-sinks.md %}#pub-sub-sink-configuration) for more details on using this option.
<a name="resolved-option"></a>`resolved` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events per changefeed in a format dependent on the connected sink. Resolved timestamp events do not emit until the changefeed job has reached a checkpoint. <br><br>Set an optional minimal duration between emitting resolved timestamps. Example: `resolved='10s'`. This option will **only** emit a resolved timestamp event if the timestamp has advanced and at least the optional duration has elapsed. If a duration is unspecified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set [`min_checkpoint_frequency`](#min-checkpoint-frequency) to at minimum the same value as `resolved`, because `resolved` messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems to be exactly the same text as on changefeed-for.md, in which case LGTM, but please let me know if i didn't read closely enough

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's isn't quite the same, but I think Andy reviewed it extensively enough. It has been annoying since (as you mention) the diff isn't really a diff in the markdown tables.

@@ -139,7 +139,7 @@ Option | Value | Description
<a name="on-error"></a>`on_error` | `pause` / `fail` | Use `on_error=pause` to pause the changefeed when encountering **non**-retryable errors. `on_error=pause` will pause the changefeed instead of sending it into a terminal failure state. **Note:** Retryable errors will continue to be retried with this option specified. <br><br>Use with [`protect_data_from_gc_on_pause`](#protect-data-from-gc-on-pause) to protect changes from [garbage collection]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds).<br><br>If a changefeed with `on_error=pause` is running when a watched table is [truncated]({% link {{ page.version.version }}/truncate.md %}), the changefeed will pause but will not be able to resume reads from that table. Using [`ALTER CHANGEFEED`]({% link {{ page.version.version }}/alter-changefeed.md %}) to drop the table from the changefeed and then [resuming the job]({% link {{ page.version.version }}/resume-job.md %}) will work, but you cannot add the same table to the changefeed again. Instead, you will need to [create a new changefeed](#start-a-new-changefeed-where-another-ended) for that table.<br><br>Default: `on_error=fail`
<a name="protect-data-from-gc-on-pause"></a>`protect_data_from_gc_on_pause` | N/A | This option is deprecated as of v23.2 and will be removed in a future release.<br><br>When a [changefeed is paused]({% link {{ page.version.version }}/pause-job.md %}), ensure that the data needed to [resume the changefeed]({% link {{ page.version.version }}/resume-job.md %}) is not garbage collected. If `protect_data_from_gc_on_pause` is **unset**, pausing the changefeed will release the existing protected timestamp records. It is also important to note that pausing and adding `protect_data_from_gc_on_pause` to a changefeed will not protect data if the [garbage collection]({% link {{ page.version.version }}/configure-replication-zones.md %}#gc-ttlseconds) window has already passed. <br><br>Use with [`on_error=pause`](#on-error) to protect changes from garbage collection when encountering non-retryable errors. <br><br>Refer to [Protect Changefeed Data from Garbage Collection]({% link {{ page.version.version }}/protect-changefeed-data.md %}) for more detail on protecting changefeed data.<br><br>**Note:** If you use this option, changefeeds that are left paused for long periods of time can prevent garbage collection. Use with the [`gc_protect_expires_after`](#gc-protect-expires-after) option to set a limit for protected data and for how long a changefeed will remain paused.
<a name="pubsub-sink-config"></a>`pubsub_sink_config` | [`STRING`]({% link {{ page.version.version }}/string.md %}) | Set fields to configure sink batching and retries. The schema is as follows:<br><br> `{ "Flush": { "Messages": ..., "Bytes": ..., "Frequency": ..., }, "Retry": {"Max": ..., "Backoff": ..., } }`. <br><br>**Note** that if either `Messages` or `Bytes` are nonzero, then a non-zero value for `Frequency` must be provided. <br><br>Refer to [Pub/Sub sink configuration]({% link {{ page.version.version }}/changefeed-sinks.md %}#pub-sub-sink-configuration) for more details on using this option.
<a name="resolved"></a>`resolved` | [Duration string](https://pkg.go.dev/time#ParseDuration) | Emits [resolved timestamp]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) events per changefeed in a format dependent on the connected sink. Resolved timestamp events do not emit until the changefeed job has reached a checkpoint. <br><br>Set an optional minimal duration between emitting resolved timestamps. Example: `resolved='10s'`. This option will **only** emit a resolved timestamp event if the timestamp has advanced and at least the optional duration has elapsed. If a duration is unspecified, all resolved timestamps are emitted as the high-water mark advances.<br><br>**Note:** If you set `resolved` lower than `30s`, then you **must** also set [`min_checkpoint_frequency`](#min-checkpoint-frequency) to at minimum the same value as `resolved`, because `resolved` messages may be emitted less frequently than `min_checkpoint_frequency`, but cannot be emitted more frequently.<br><br>Refer to [Resolved messages]({% link {{ page.version.version }}/changefeed-messages.md %}#resolved-messages) for more detail.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside: github does a poor job presenting these one-really-really-long-line diffs, it just paints them all red/green and doesn't bother doing a proper "word diff" (which it does do for shorter lines)

one day they'll pay for their crimes...

@kathancox kathancox force-pushed the resolved-option-description branch from 0b6e62f to 2fa6547 Compare February 11, 2025 20:39
@kathancox kathancox merged commit 12cf276 into main Feb 11, 2025
6 checks passed
@kathancox kathancox deleted the resolved-option-description branch February 11, 2025 20:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants