Skip to content

Conversation

@gow
Copy link
Contributor

@gow gow commented Oct 22, 2025

Adding proto changes that are required to implement Workflow Pause/Unpause feature. Added,

  • WorkflowExecutionStatus.WORKFLOW_EXECUTION_STATUS_PAUSED enum
  • EVENT_TYPE_WORKFLOW_EXECUTION_PAUSED & EVENT_TYPE_WORKFLOW_EXECUTION_PAUSED event enums and corresponding event attributes
  • PauseWorkflowExecution & UnpauseWorkflowExecution gRPC along with corresponding request/responses.

Why?
These changes are needed to implement workflow pause/unpause feature.

Breaking changes
Introduction of new status WORKFLOW_EXECUTION_STATUS_PAUSED may break any saved visibility queries that may have assumed WorkflowExecutionStatus != 'Running' returns all closed workflows.

@gow gow requested review from a team as code owners October 22, 2025 21:58
@gow gow marked this pull request as draft October 22, 2025 21:59
@gow gow marked this pull request as ready for review October 22, 2025 22:43
Comment on lines +873 to +899
message WorkflowExecutionPausedEventAttributes {
// The identity of the client who paused the workflow execution.
string identity = 1;
// The reason for pausing the workflow execution.
string reason = 2;
// The request ID of the request that paused the workflow execution.
string request_id = 3;
}

// Attributes for an event marking that a workflow execution was unpaused.
message WorkflowExecutionUnpausedEventAttributes {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you forgot to add these to the actual oneof below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I had forgotten them. Added now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to wait to merge this until the implementation is present so we can confirm these types of things work with the code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I will not merge until the corresponding server PR using this is not ready.

Comment on lines 1263 to 1264
// PauseWorkflowExecution pauses the workflow execution specified in the request.
rpc PauseWorkflowExecution (PauseWorkflowExecutionRequest) returns (PauseWorkflowExecutionResponse) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should clarify what pause actually does in docs here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more detailed description. Also annotated that these APIs are experimental

// The identity of the client who initiated this request.
string identity = 3;
// Reason to pause the workflow execution.
string reason = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I see this reason in visibility? Or at least via describe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The reason will be indexed in visibility and also be returned in describe response (to be shown in CLI & WebUI)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see it in the protos that come back from list/describe. We aren't making a new search attribute for this right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had missed them. Will update.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added WorkflowExecutionPauseInfo

Re: search attribute - I think we can reuse an existing attribute. Will check.

@gow gow requested a review from cretz October 23, 2025 18:54
Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw this targets main. Can the implementation be completed before we do that?

// The identity of the client who initiated this request.
string identity = 3;
// Reason to pause the workflow execution.
string reason = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see it in the protos that come back from list/describe. We aren't making a new search attribute for this right?

Comment on lines +873 to +899
message WorkflowExecutionPausedEventAttributes {
// The identity of the client who paused the workflow execution.
string identity = 1;
// The reason for pausing the workflow execution.
string reason = 2;
// The request ID of the request that paused the workflow execution.
string request_id = 3;
}

// Attributes for an event marking that a workflow execution was unpaused.
message WorkflowExecutionUnpausedEventAttributes {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to wait to merge this until the implementation is present so we can confirm these types of things work with the code

};
}

// Note: This is an experimental API and the behavior may change in a future release.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, this doesn't mean we can just change things at will. People deploy proxies on certain API versions and may not upgrade them immediately on every backwards incompatible change. We have to try our best to not land it until we're pretty happy with the implementation.

// - The workflow execution status changes to `PAUSED` and a new WORKFLOW_EXECUTION_PAUSED event is added to the history
// - No new workflow tasks or activity tasks are dispatched.
// - Any workflow task currently executing on the worker will be allowed to complete.
// - Any activity task currently executing will be allowed to complete. But the heartbeat response will be annotated with 'activity_paused=true'.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic doc note - It won't be allowed to complete if the activity is properly getting interruptions. Also, this doesn't cover activities backing off. I'd rephrase this to just say activities currently running are paused and let the definition of activity pause be enough instead of trying to re-describe it here.

// - No new workflow tasks or activity tasks are dispatched.
// - Any workflow task currently executing on the worker will be allowed to complete.
// - Any activity task currently executing will be allowed to complete. But the heartbeat response will be annotated with 'activity_paused=true'.
// - Signals, child workflows, timers & timeouts, activity completion etc will continue to be processed by the server. No completed work is lost during pause.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic doc note - I would just say "all server-side events will continue to be emitted" and not try to list them here since they'll become stale.

// UnpauseWorkflowExecution unpauses a previously paused workflow execution specified in the request.
// Unpausing a workflow execution results in
// - The workflow execution status changes to `RUNNING` and a new WORKFLOW_EXECUTION_UNPAUSED event is added to the history
// - New workflow task and activity tasks are dispatched (if required).
Copy link
Member

@cretz cretz Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic doc note - I think you should just say activities are unpaused and use existing activity unpause definition instead of attempting to re-describe here

@gow
Copy link
Contributor Author

gow commented Oct 24, 2025

I saw this targets main. Can the implementation be completed before we do that?

Absolutely, I'll leave this up in this branch and not merge it until server PR is ready. Did you have a different workflow in mind?

@cretz
Copy link
Member

cretz commented Oct 24, 2025

Did you have a different workflow in mind?

Nope, this works for me. I only care about main heh.

temporal.api.enums.v1.UpdateAdmittedEventOrigin origin = 2;
}

// Attributes for an event marking that a workflow execution was paused.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Attributes for an event marking that a workflow execution was paused.
// Attributes for an event marking that a workflow execution was paused.

Copy link
Member

@bergundy bergundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few small non-blocking comments.

// Namespace of the workflow to unpause.
string namespace = 1;
// Execution info of the workflow to unpause.
temporal.api.common.v1.WorkflowExecution workflow_execution = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debating whether we want to keep using this struct, we decided against a wrapper struct for this id, run_id tuple for all of the activity APIs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I think we decided to inline these fields. Will make the change.

Comment on lines 890 to 893
// The ID of the `WORKFLOW_EXECUTION_PAUSED` event that this unpause corresponds to.
int64 paused_event_id = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? It's going to be a bit of a hassle to implement due to buffered events but not a big deal, just beware.

Copy link
Contributor Author

@gow gow Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I may need it to quickly find the pause event when we are cherry picking the events. I'll remove it for now and add it if needed.
Re: buffered events - this should not be an issue. By the time we record unpause event, we are guaranteed to have the pause event recorded and the event ID generated for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this for now. Did not need it.

WORKFLOW_EXECUTION_STATUS_TERMINATED = 5;
WORKFLOW_EXECUTION_STATUS_CONTINUED_AS_NEW = 6;
WORKFLOW_EXECUTION_STATUS_TIMED_OUT = 7;
WORKFLOW_EXECUTION_STATUS_PAUSED = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verifying that we have consensus for adding this new status... this is fine with me FTR.

Copy link
Member

@cretz cretz Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I am the only holdout. I still think it breaks Temporal expectation that a workflow is either running or closed, and that running has nothing to do with whether tasks are being processed (or signals or activities or timers). Having said that, I think unfortunately people are wanting to break these expectations.

I also think it's confusing to assume the entire workflow is in a paused status just because some things are paused. When a user asks us to support, say, pausing only signals/updates, it's gonna look real strange that we chose to treat pause as a status instead of what it truly is - a collection of paused aspects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had the consensus in the internal discussion about this. The reasoning was that pausing a workflow would set an expectation that the workflow status to change to PAUSED. Otherwise "workflow pause" operation appears broken.
If (and when) we support pausing of individual aspects we would have individual status for them to make it clear that only certain aspects are paused.

Copy link
Member

@cretz cretz Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had the consensus in the internal discussion about this

👍

The reasoning was that pausing a workflow would set an expectation that the workflow status to change to PAUSED

I don't think this is true that it would set this expectation any more than "cancel" would set an expectation of a workflow status changing to "cancel requested". Basically, there is no such thing as pausing a workflow, just pausing certain parts. I want to warn about user confusion by marking what Temporal deems to be a started/running workflow as not running.

If (and when) we support pausing of individual aspects we would have individual status for them to make it clear that only certain aspects are paused.

To clarify here, you mean the workflow status would remain running if we only paused signals, updates, etc? Or we would set it to paused even if some of it is paused? That's what we're doing with this change, e.g. the timers are not paused, but we are acting as if the whole workflow is paused when it is not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not completely sure. It depends on how we support these.
If pausing these individual aspects is a top level operation then it would have a separate status (like activity pause). If it is an option to workflow pause then the workflow status would be changed.
I'm also not sure if we will support pausing just the individual aspects without pausing the workflow tasks.

Copy link
Member

@cretz cretz Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure if we will support pausing just the individual aspects without pausing the workflow tasks.

Note, you are making a permanent "can't take back" change here for something you may not be sure about in successive phases. But if we are defining status "paused" as "workflow tasks are paused" and not some general definition of "pause", ok (even if I do disagree with the premise).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants