Slow request and not reading request body causes ConnectionResetError on large responses if keep-alive is disabled, even if worker timeout is disabled and in sync, async and threaded workers #3334

@michalc

Description

Note

Earlier versions of this bug report said the issue was limited to GET requests, to requests with an empty body, and to transfer-encoding: chunked requests, but these claims were inaccurate. The issue happens on both GET and POST requests, when the request body is non-empty, and for both chunked and non-chunked request bodies. Earlier versions also did not make clear that the error happens even if the worker timeout is disabled, and they covered only sync workers, whereas the issue happens in all workers except tornado, wherever keep-alive is not possible or is disabled.

If I make a large empty file:

mkfile -n 1g temp_1GB_file
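
mkfile ships with macOS, so for anyone trying this on Linux, here is a rough Python sketch that should produce an equivalent file. The helper name make_empty_file is made up for illustration, and it assumes that 1g means 1024**3 bytes and that a file extended with truncate (which reads back as zeros) is a fair substitute for what mkfile -n creates:

```python
import os


def make_empty_file(path, size_bytes):
    """Create a file of size_bytes bytes that reads back as zeros.

    Hypothetical helper, not part of gunicorn or Flask. Extending the file
    with truncate() usually yields a sparse file, so no data is written.
    """
    with open(path, "wb") as f:
        f.truncate(size_bytes)
    return os.path.getsize(path)


if __name__ == "__main__":
    # Assumption: "1g" in the mkfile command means 1024**3 bytes.
    make_empty_file("temp_1GB_file", 1024 ** 3)
```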

A basic gunicorn+Flask server to serve it (without reading the HTTP request body) in test.py:

from flask import Flask

app = Flask(__name__, static_folder='.')

@app.route('/<path:path>', methods=['GET', 'POST'])
def static_proxy(path):
    return app.send_static_file(path)

Run with the worker timeout disabled (to exclude that as the cause), and on any of the sync, gevent, eventlet or gthread worker classes, as long as keep-alive is not possible or is disabled:

gunicorn --worker-class sync --workers=1 --timeout=0 --keep-alive=0 test:app

And then some code to slowly make a GET or POST request to retrieve the large file through the webserver, but in a way that forces the request to be slow, specifically being slow about both ending the request body and fetching the response:

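import urllib3
import time

def data():
    # Delay the request body so the request is slow to finish
    time.sleep(4)
    yield b'Anything'

# A generator body is sent with transfer-encoding: chunked
resp = urllib3.request("POST", "http://localhost:8000/temp_1GB_file",
    body=data(),
    preload_content=False,
)
# Read the response slowly, pausing briefly after each 64 KiB chunk
for chunk in resp.stream(65536):
    time.sleep(0.001)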

Then almost always I get the error on the client:

urllib3.exceptions.ProtocolError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer'))

And no error on the server at all.
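
To see how far the download gets before the reset, something like the following could wrap the read loop. This is only a sketch: count_bytes_until_error and its parameters are hypothetical names, not part of urllib3, and it simply counts the bytes of whatever iterable of chunks it is given:

```python
import time


def count_bytes_until_error(chunks, exc_types=(OSError,), delay=0.0):
    """Consume an iterable of byte chunks and report how much arrived.

    Returns (total_bytes, exception_or_None). Hypothetical helper for
    diagnosis only; delay mimics the slow reader in the client above.
    """
    total = 0
    try:
        for chunk in chunks:
            total += len(chunk)
            if delay:
                time.sleep(delay)
    except exc_types as exc:
        # The stream broke part-way through: report what was received.
        return total, exc
    return total, None
```

With the client above, this would presumably be called as count_bytes_until_error(resp.stream(65536), exc_types=(urllib3.exceptions.ProtocolError,), delay=0.001), since that is the exception type shown in the error.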

Interestingly:

  • if I remove the sleeps in the request code, the error doesn't occur
  • or if I use Flask's built-in webserver via flask --app test run --port 8080 instead of gunicorn, the error doesn't occur
  • or if I use uwsgi via uwsgi --http :8000 --wsgi-file test.py --callable app, the error doesn't occur
  • or if I re-enable keep-alive on the gevent, eventlet or gthread workers, the error doesn't occur
  • or if I access the bytes of the request body on the server before returning from the handler, say from print(request.data), then the error doesn't occur
  • or if I reduce the amount of data returned from the server, the error does not occur
  • or if I use the tornado worker class, the error does not occur
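
The observation above about accessing the request body could be generalised into a small WSGI wrapper that reads any unread body after the handler has run and before the response is handed back to the server. This is a sketch under assumptions, not a confirmed fix: DrainRequestBody is a hypothetical name, I have not verified that it avoids the reset in every worker, and it assumes the server's wsgi.input returns b'' at the end of the body (on a server where it does not, the read could block):

```python
class DrainRequestBody:
    """WSGI middleware that consumes any unread request body.

    Hypothetical workaround sketch: the wrapped app runs first, so it can
    still read the body itself; whatever is left over is read and discarded.
    """

    def __init__(self, app, chunk_size=65536):
        self.app = app
        self.chunk_size = chunk_size

    def __call__(self, environ, start_response):
        result = self.app(environ, start_response)
        stream = environ.get("wsgi.input")
        if stream is not None:
            # Assumption: read() returns b"" once the body is exhausted.
            while stream.read(self.chunk_size):
                pass
        return result
```

In test.py this would be hooked in with app.wsgi_app = DrainRequestBody(app.wsgi_app), which is the usual way to attach WSGI middleware to a Flask app. With the slow client above, the drain waits for the delayed body just as print(request.data) would.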

And:

  • if I send the request body without chunked encoding, i.e. by passing headers={'content-length': '8'} in the above case, the issue still occurs

I'm on gunicorn 23.0.0 on macOS (but I suspect this also happens on Linux in Docker, since I'm chasing down a production issue).

Running Wireshark on the connection, I do see an RST from server to client just before the error, and nothing obviously unusual before it (although I'm not sure I would recognise much).
