Webhooks #1622

Open
tobias93 opened this issue Nov 19, 2024 · 0 comments

It would be nice if Steep could notify client applications whenever the status of a submitted workflow changes. Client applications could then, for example, send notification emails to users when a workflow finishes or fails. Currently, this is only possible via polling.

Proposed changes

Add an optional field called webhook to the workflow data model:

Example:

api: 4.7.0
webhook: "http://example.com/steep-webhook"
actions:
  - type: execute
    service: copy
    inputs:
      - id: input_file
        value: example1.txt
    outputs:
      - id: output_file
        var: outputFile1

Whenever the status of the submission changes, Steep should make a POST request to the configured URL:

POST http://example.com/steep-webhook HTTP/1.1
Content-Type: application/json
X-Steep-Signature: sha256=757107ea0eb2509fc211221cce984b8a37570b6d7586c22c46f4379c8b043e17

{
    "timestamp": "2024-11-19T11:20:38.347Z",
    "submission": "akpm6yojjigral4cdxgq",
    "status": "SUCCESS"
}

Design Considerations

The Standard Webhooks specification mentions a few of the intricacies one should consider when implementing webhooks. It is definitely worth a read.

Retry handling

To reduce the possibility of losing events (e.g. when the client application is under maintenance or there are network issues), Steep should retry webhook delivery with exponential backoff until it succeeds with an HTTP 200 status code or a retry limit is reached. For the client application, this means that the hook action should either be idempotent or check for duplicate events.

A first version of the webhooks feature would work perfectly fine without this though.
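
The retry behaviour described above could look roughly like the following sketch. It is only an illustration, not a proposal for the actual implementation: the function name, the parameters and their default values are made up, and `send` and `sleep` are passed in so that the logic stays independent of any particular HTTP client.

```python
from typing import Callable


def deliver_with_backoff(
    send: Callable[[], int],
    sleep: Callable[[float], None],
    max_attempts: int = 5,      # hypothetical default
    base_delay: float = 1.0,    # seconds, hypothetical default
    max_delay: float = 60.0,    # seconds, hypothetical default
) -> bool:
    """Call send() until it returns HTTP 200 or max_attempts is reached.

    send() performs one delivery attempt and returns the HTTP status code.
    Returns True if the webhook was delivered, False if we gave up.
    """
    for attempt in range(max_attempts):
        try:
            if send() == 200:
                return True
        except OSError:
            # Network-level failure (connection refused, timeout, ...):
            # treat it like a failed attempt and retry.
            pass
        if attempt < max_attempts - 1:
            # The delay doubles with every failed attempt, up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    return False
```

In a real deployment, `sleep` would be `time.sleep` (or a timer on the event loop), and some random jitter would probably be added to the delay so that many failing webhooks are not all retried at the same moment.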

Authentication

Client applications receiving webhooks from Steep might want to validate that the request actually originated from Steep.

It seems to be common practice to do this by including an HMAC of the request body as a header value (here: X-Steep-Signature). GitHub, for example, does this with its webhooks.

The shared secret for the HMAC would be configured in steep.yaml.

A first version of the webhooks feature would work perfectly fine without this though.
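
To make the signature scheme more concrete, here is a minimal sketch of how the header value could be computed and how a client could check it. The function names and the secret are hypothetical. The only things taken from the example request above are the `sha256=` prefix and the hex encoding of the digest.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Return the value for the X-Steep-Signature header."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a received signature against the raw request body."""
    expected = sign_body(secret, body)
    # compare_digest runs in constant time, which avoids leaking
    # information about the expected signature through timing.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The client has to verify the signature against the raw bytes of the request body, before parsing the JSON. Re-serialising the parsed payload may change whitespace or key order and would then produce a different HMAC.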

Security

This feature allows anyone who can submit workflows to trigger HTTP requests to arbitrary URLs that are reachable by Steep. Depending on the specific deployment, this might allow for server-side request forgery (SSRF) attacks.

Example:

A common scenario will be that Steep is deployed next to a few other services in the same network. Some of these services are meant to be private and not accessible to the general public. A shared firewall only allows access from outside to Steep, but not to the private services. Since Steep is already behind the firewall, an attacker could use webhooks to trigger requests to one of the private services that are not meant to be reachable by them.

To mitigate this, I would only allow webhooks to whitelisted URLs. For example, steep.yaml could contain a list of regular expressions matching whitelisted webhook URLs.
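
A check against such a list could be as small as the following sketch. The patterns and the function name are invented for illustration. In Steep, the list would come from steep.yaml under whatever key is chosen for it.

```python
import re
from typing import Iterable

# Hypothetical whitelist, as it might be loaded from steep.yaml.
ALLOWED_PATTERNS = [
    r"https://hooks\.example\.com/.*",
    r"http://example\.com/steep-webhook",
]


def is_webhook_allowed(url: str, allowed_patterns: Iterable[str]) -> bool:
    """Return True if the URL matches at least one whitelisted pattern.

    fullmatch() requires the pattern to match the whole URL, so a
    pattern cannot be satisfied by a mere prefix or substring.
    """
    return any(re.fullmatch(pattern, url) for pattern in allowed_patterns)
```

The patterns need to be written with some care: dots should be escaped, and the host part should end in a `/` or at the end of the pattern, so that something like `https://hooks.example.com.evil.org/` does not slip through. Steep should probably also not follow redirects when calling the webhook, since a redirect could lead to a URL that is not on the list.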

Alternative solutions

Of course, webhooks are not the only way client applications could be notified of workflow status changes.

Some alternative solutions would be:

  • Alternative solution 1: Allow access to the event bus
    • Harder to connect to for client applications not using Vert.x / Hazelcast
    • Messages on the event bus are implementation-specific for the current steep version. They might change without notice in any future version.
  • Alternative solution 2: Configure a single HTTP endpoint in steep.yaml that receives updates for ALL workflows.
    • Easier to implement
    • Fewer security concerns, because the target URL can't be controlled by the user / client application.
    • Not as flexible
  • Alternative solution 3: Add an API endpoint like /workflows/:id/events that uses HTTP server-sent events (SSE) or WebSockets to notify the client of any updates to a submission.
    • Many long-lasting HTTP connections
    • Maybe the existing SockJS connection used by the Steep Web GUI can be reused here?
    • Harder to implement for the client application, especially if updates to more than one workflow are required.
  • Alternative solution 4: Keep using polling