-
Notifications
You must be signed in to change notification settings - Fork 78
[Workflow Pause] Proto changes #653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| message WorkflowExecutionPausedEventAttributes { | ||
| // The identity of the client who paused the workflow execution. | ||
| string identity = 1; | ||
| // The reason for pausing the workflow execution. | ||
| string reason = 2; | ||
| // The request ID of the request that paused the workflow execution. | ||
| string request_id = 3; | ||
| } | ||
|
|
||
| // Attributes for an event marking that a workflow execution was unpaused. | ||
| message WorkflowExecutionUnpausedEventAttributes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you forgot to add these to the actual oneof below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I had forgotten them. Added now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might want to wait to merge this until the implementation is present so we can confirm these types of things work with the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I will not merge until the corresponding server PR using this is not ready.
| // PauseWorkflowExecution pauses the workflow execution specified in the request. | ||
| rpc PauseWorkflowExecution (PauseWorkflowExecutionRequest) returns (PauseWorkflowExecutionResponse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should clarify what pause actually does in docs here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more detailed description. Also annotated that these APIs are experimental
| // The identity of the client who initiated this request. | ||
| string identity = 3; | ||
| // Reason to pause the workflow execution. | ||
| string reason = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I see this reason in visibility? Or at least via describe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The reason will be indexed in visibility and also be returned in describe response (to be shown in CLI & WebUI)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see it in the protos that come back from list/describe. We aren't making a new search attribute for this right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had missed them. Will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added WorkflowExecutionPauseInfo
Re: search attribute - I think we can reuse an existing attribute. Will check.
cretz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw this targets main. Can the implementation be completed before we do that?
| // The identity of the client who initiated this request. | ||
| string identity = 3; | ||
| // Reason to pause the workflow execution. | ||
| string reason = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see it in the protos that come back from list/describe. We aren't making a new search attribute for this right?
| message WorkflowExecutionPausedEventAttributes { | ||
| // The identity of the client who paused the workflow execution. | ||
| string identity = 1; | ||
| // The reason for pausing the workflow execution. | ||
| string reason = 2; | ||
| // The request ID of the request that paused the workflow execution. | ||
| string request_id = 3; | ||
| } | ||
|
|
||
| // Attributes for an event marking that a workflow execution was unpaused. | ||
| message WorkflowExecutionUnpausedEventAttributes { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might want to wait to merge this until the implementation is present so we can confirm these types of things work with the code
| }; | ||
| } | ||
|
|
||
| // Note: This is an experimental API and the behavior may change in a future release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, this doesn't mean we can just change things at will. People deploy proxies on certain API versions and may not upgrade them immediately on every backwards incompatible change. We have to try our best to not land it until we're pretty happy with the implementation.
| // - The workflow execution status changes to `PAUSED` and a new WORKFLOW_EXECUTION_PAUSED event is added to the history | ||
| // - No new workflow tasks or activity tasks are dispatched. | ||
| // - Any workflow task currently executing on the worker will be allowed to complete. | ||
| // - Any activity task currently executing will be allowed to complete. But the heartbeat response will be annotated with 'activity_paused=true'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic doc note - It won't be allowed to complete if the activity is properly getting interruptions. Also, this doesn't cover activities backing off. I'd rephrase this to just say activities currently running are paused and let the definition of activity pause be enough instead of trying to re-describe it here.
| // - No new workflow tasks or activity tasks are dispatched. | ||
| // - Any workflow task currently executing on the worker will be allowed to complete. | ||
| // - Any activity task currently executing will be allowed to complete. But the heartbeat response will be annotated with 'activity_paused=true'. | ||
| // - Signals, child workflows, timers & timeouts, activity completion etc will continue to be processed by the server. No completed work is lost during pause. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic doc note - I would just say "all server-side events will continue to be emitted" and not try to list them here since they'll become stale.
| // UnpauseWorkflowExecution unpauses a previously paused workflow execution specified in the request. | ||
| // Unpausing a workflow execution results in | ||
| // - The workflow execution status changes to `RUNNING` and a new WORKFLOW_EXECUTION_UNPAUSED event is added to the history | ||
| // - New workflow task and activity tasks are dispatched (if required). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pedantic doc note - I think you should just say activities are unpaused and use existing activity unpause definition instead of attempting to re-describe here
Absolutely, I'll leave this up in this branch and not merge it until server PR is ready. Did you have a different workflow in mind? |
Nope, this works for me. I only care about |
| temporal.api.enums.v1.UpdateAdmittedEventOrigin origin = 2; | ||
| } | ||
|
|
||
| // Attributes for an event marking that a workflow execution was paused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Attributes for an event marking that a workflow execution was paused. | |
| // Attributes for an event marking that a workflow execution was paused. |
bergundy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few small non-blocking comments.
| // Namespace of the workflow to unpause. | ||
| string namespace = 1; | ||
| // Execution info of the workflow to unpause. | ||
| temporal.api.common.v1.WorkflowExecution workflow_execution = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debating whether we want to keep using this struct, we decided against a wrapper struct for this id, run_id tuple for all of the activity APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I think we decided to inline these fields. Will make the change.
| // The ID of the `WORKFLOW_EXECUTION_PAUSED` event that this unpause corresponds to. | ||
| int64 paused_event_id = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed? It's going to be a bit of a hassle to implement due to buffered events but not a big deal, just beware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I may need it to quickly find the pause event when we are cherry picking the events. I'll remove it for now and add it if needed.
Re: buffered events - this should not be an issue. By the time we record unpause event, we are guaranteed to have the pause event recorded and the event ID generated for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this for now. Did not need it.
| WORKFLOW_EXECUTION_STATUS_TERMINATED = 5; | ||
| WORKFLOW_EXECUTION_STATUS_CONTINUED_AS_NEW = 6; | ||
| WORKFLOW_EXECUTION_STATUS_TIMED_OUT = 7; | ||
| WORKFLOW_EXECUTION_STATUS_PAUSED = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verifying that we have consensus for adding this new status... this is fine with me FTR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I am the only holdout. I still think it breaks Temporal expectation that a workflow is either running or closed, and that running has nothing to do with whether tasks are being processed (or signals or activities or timers). Having said that, I think unfortunately people are wanting to break these expectations.
I also think it's confusing to assume the entire workflow is in a paused status just because some things are paused. When a user asks us to support, say, pausing only signals/updates, it's gonna look real strange that we chose to treat pause as a status instead of what it truly is - a collection of paused aspects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had the consensus in the internal discussion about this. The reasoning was that pausing a workflow would set an expectation that the workflow status to change to PAUSED. Otherwise "workflow pause" operation appears broken.
If (and when) we support pausing of individual aspects we would have individual status for them to make it clear that only certain aspects are paused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had the consensus in the internal discussion about this
👍
The reasoning was that pausing a workflow would set an expectation that the workflow status to change to PAUSED
I don't think this is true that it would set this expectation any more than "cancel" would set an expectation of a workflow status changing to "cancel requested". Basically, there is no such thing as pausing a workflow, just pausing certain parts. I want to warn about user confusion by marking what Temporal deems to be a started/running workflow as not running.
If (and when) we support pausing of individual aspects we would have individual status for them to make it clear that only certain aspects are paused.
To clarify here, you mean the workflow status would remain running if we only paused signals, updates, etc? Or we would set it to paused even if some of it is paused? That's what we're doing with this change, e.g. the timers are not paused, but we are acting as if the whole workflow is paused when it is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually not completely sure. It depends on how we support these.
If pausing these individual aspects is a top level operation then it would have a separate status (like activity pause). If it is an option to workflow pause then the workflow status would be changed.
I'm also not sure if we will support pausing just the individual aspects without pausing the workflow tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not sure if we will support pausing just the individual aspects without pausing the workflow tasks.
Note, you are making a permanent "can't take back" change here for something you may not be sure about in successive phases. But if we are defining status "paused" as "workflow tasks are paused" and not some general definition of "pause", ok (even if I do disagree with the premise).
This reverts commit 0f38a7a.
Adding proto changes that are required to implement Workflow Pause/Unpause feature. Added,
WorkflowExecutionStatus.WORKFLOW_EXECUTION_STATUS_PAUSEDenumEVENT_TYPE_WORKFLOW_EXECUTION_PAUSED&EVENT_TYPE_WORKFLOW_EXECUTION_PAUSEDevent enums and corresponding event attributesPauseWorkflowExecution&UnpauseWorkflowExecutiongRPC along with corresponding request/responses.Why?
These changes are needed to implement workflow pause/unpause feature.
Breaking changes
Introduction of new status
WORKFLOW_EXECUTION_STATUS_PAUSEDmay break any saved visibility queries that may have assumedWorkflowExecutionStatus != 'Running'returns all closed workflows.