-
Notifications
You must be signed in to change notification settings - Fork 78
[DO NOT MERGE] Add initial WorkflowService API definitions for durable non-workflow Nexus calls #659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| } | ||
|
|
||
| // Defines whether to allow re-using a Nexus operation ID from a previously *closed* operation. | ||
| // If the request is denied, the server returns a `NexusOperationAlreadyStarted` error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be an ExecutionAlreadyStarted error.
| string type = 2; | ||
| } | ||
|
|
||
| // When StartNexusOperation uses the OPERATION_ID_CONFLICT_POLICY_USE_EXISTING and there is already an existing running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove this for now, we're not supporting callbacks and attaching to the same operation at this point.
| // Incremented each time a new attempt is made to send a StartOperation or RequestCancelOperation request to the | ||
| // operation handler. | ||
| int32 attempt = 9; | ||
| int32 maximum_attempts = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not part of the spec.
| // If there is no next retry allowed, this field will be null. | ||
| google.protobuf.Duration current_retry_interval = 14; | ||
| // The time when the last request attempt completed. If the StartOperation request has not been completed yet, it will be null. | ||
| google.protobuf.Timestamp last_attempt_complete_time = 15; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also add a last_failure_time. We are adding this to standalone activities to preserve the last failure information even when there is a subsequent successful attempt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dandavison @fretz12 is this the name you are going with for standalone activities?
| int64 state_transition_count = 17; | ||
|
|
||
| temporal.api.common.v1.SearchAttributes search_attributes = 18; | ||
| temporal.api.common.v1.Header header = 19; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't support headers for nexus operations today.
| // Callbacks to be called by the server when this operation reaches a terminal status. | ||
| // Callback addresses must be whitelisted in the server's dynamic configuration. | ||
| repeated temporal.api.common.v1.Callback completion_callbacks = 20; | ||
| // Links to be associated with the operation. | ||
| repeated temporal.api.common.v1.Link links = 21; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop this for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will want to store backlinks though (as returned by the handler).
| temporal.api.worker.v1.WorkerInfo worker_info = 1; | ||
| } | ||
|
|
||
| message StartNexusOperationRequest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about schedule to close timeout and cancellation type?
| } | ||
|
|
||
| // DeleteNexusOperation asynchronously deletes a specific Nexus operation (when run_id is provided) or the latest | ||
| // activity execution (when run_id is not provided). If the Nexus operation is running, it will be terminated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy paste error.
| // | ||
| // (-- api-linter: core::0127::http-annotation=disabled | ||
| // aip.dev/not-precedent: Nexus has a separately defined StartOperation HTTP API. --) | ||
| rpc StartNexusOperation (StartNexusOperationRequest) returns (StartNexusOperationResponse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would need an HTTP API for this given that the durable capabilities are very different from the pure Nexus HTTP spec.
|
|
||
| // RequestCancelNexusOperation requests cancellation of a Nexus operation. | ||
| // | ||
| // Requesting to cancel an operation does not automatically transition the operation to canceled status. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will need to mention cancellation types here.
| // TerminateNexusOperation terminates an existing Nexus operation immediately. | ||
| // | ||
| // Termination does not reach the operation handler and the handler cannot react to it. A terminated operation may | ||
| // have a running attempt and will be requested to be canceled by the server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "requested to be canceled by the server"?
cretz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't do a deep dive, just a shallow look
| // Operation run ID. If empty the request targets the latest run. | ||
| string run_id = 3; | ||
| // Include the info field from the response. | ||
| bool include_info = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also a discussion at #640 (comment) concerning this. I think this should be exclude_info and info should be present by default for API clarity. It doesn't make sense for curl https://api.temporal.io/namespaces/my-ns/nexus/operations/my-op to not return info and basically return empty body.
| int32 page_size = 2; | ||
| // Token returned in ListNexusOperationsResponse. | ||
| bytes next_page_token = 3; | ||
| // Visibility query, see https://docs.temporal.io/list-filter for the syntax. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with standalone activities, somewhere we need to document the applicable search attributes
| } | ||
|
|
||
| // Information about a Nexus operation. | ||
| message NexusOperationInfo { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For standalone activities, we were provided a guarantee that 1) everything in list info would also be in here, and 2) there will be server tests to make sure that something doesn't slip through and get added to list and not here. Can this guarantee also be extended to this project? The SDK will be performing proper model reuse on these disparate-in-API-only models. But things like close time not being here I think is an issue (though I guess we can make users kinda derive it from last attempt completion times if status is done, but it is rough that we do this to our users, they may just prefer to call list for this one item).
| // | ||
| // (-- api-linter: core::0127::http-annotation=disabled | ||
| // aip.dev/not-precedent: Nexus has a separately defined CancelOperation HTTP API. --) | ||
| rpc RequestCancelNexusOperation (RequestCancelNexusOperationRequest) returns (RequestCancelNexusOperationResponse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of these should still have HTTP API endpoints as part of our standard HTTP API approach. Not sure the cancel operation HTTP API that implementers may implement is the same as the outbound caller one with our auth and such that we put on this API.
What changed?
Added new protos and WorkflowService API definitions to support the initial functionality required for making durable calls to Nexus operations from non-workflow callers.
Why?
New feature. Mimics the work done for standalone activities in #640
Note this only covers the Durable delivery case, which we plan to be the first implementation and default delivery method. Support for Direct calls will come later, once the UX for that is more clearly defined.