HTTP 503 when downloading CSVs #219

@adamhooper

When a workflow has been changed but not yet rendered (so its Steps' cached render results don't exist or are stale), requests to GET /public/moduledata/live/:id.(csv|json) will return HTTP 503.

Steps to reproduce:

  1. Create a workflow with a "Load HTML from URL" module
  2. Point it to https://www.nytimes.com and set auto-refresh every 5min
  3. Look up the "API endpoint" (/public/moduledata/live/:id.csv), and then close the browser window
  4. Six minutes later, request data from the endpoint.

Expected results: you get new data
Actual results: HTTP 503 -- but if you retry a few seconds later, you'll get data.

The problem: Workbench renders workflows in background processes, while a GET request is handled in the foreground. If the workflow isn't rendered yet, we can't know when it will be.

This plays badly with auto-refreshes: when auto-refreshing a step, if the workflow has no steps with notifications enabled and nobody has a web client open to the workflow, Workbench skips rendering altogether. (It will only render on-demand.)

The Workbench-side workaround: when we return HTTP 503, we schedule another render of the workflow, in case it hasn't been scheduled yet.
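
A rough sketch of that workaround, not Workbench's actual code: `handle_live_data_request`, `cached_results` and `scheduled_renders` are hypothetical names, and the real render queue is assumed to behave like a set (re-adding an already-queued workflow does nothing).

```python
def handle_live_data_request(workflow_id, cached_results, scheduled_renders):
    """Return (status, body) for a request to the live-data endpoint.

    cached_results: hypothetical dict of workflow id -> fresh rendered bytes.
    scheduled_renders: hypothetical set of workflow ids awaiting a render.
    """
    body = cached_results.get(workflow_id)
    if body is not None:
        # A fresh render exists, so serve it directly.
        return 200, body

    # No fresh render: make sure one is queued, then tell the client to retry.
    # Adding to a set is harmless if the render was already scheduled.
    scheduled_renders.add(workflow_id)
    return 503, b""
```

The point is only that the 503 path has a side effect: the request that fails is also the one that guarantees a later request can succeed.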

There are two user-side workarounds:

  1. Enable notifications on any step in the workflow. That will force a render every time data changes -- greatly shrinking the window during which a request returns HTTP 503.
  2. Configure the client to retry after 10-30s upon HTTP 503.
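
For the second workaround, a client-side retry loop could look roughly like the sketch below. It uses only the Python standard library; `http_get` and `fetch_with_retry` are illustrative names, and the 15-second delay and 5-attempt limit are assumptions picked from the 10-30s range above.

```python
import time
import urllib.error
import urllib.request


def http_get(url):
    """GET url and return (status, body), without raising on HTTP errors."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def fetch_with_retry(url, max_attempts=5, delay_seconds=15,
                     get=http_get, sleep=time.sleep):
    """GET url, waiting and retrying whenever the server answers HTTP 503.

    Returns the last (status, body) seen. `get` and `sleep` are injectable
    so the loop can be exercised without a network or real delays.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = get(url)
        if status != 503:
            return status, body
        if attempt < max_attempts:
            # The workflow is probably still rendering; give it time.
            sleep(delay_seconds)
    return status, body
```

Usage would be something like `status, body = fetch_with_retry(endpoint_url)`, where `endpoint_url` is the workflow's "API endpoint". Since the first 503 is what schedules the render, the second attempt will usually succeed.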

A better solution is to let users "turn on" API endpoints instead of supplying them implicitly. API endpoints should always host valid data -- even if it's stale.
