Scheduled jobs often fail due to db timeout #1887

Open
ehuss opened this issue Jan 23, 2025 · 1 comment
Labels
A-scheduled-jobs (Area: Scheduled jobs), bug (Something isn't working)

Comments

Contributor

ehuss commented Jan 23, 2025

The scheduled jobs thread regularly fails with the following:

database connection error: error communicating with the server: Connection timed out (os error 110)
thread 'main' panicked at /src/db.rs:239:47:
called `Result::unwrap()` on an `Err` value: Getting jobs data
Caused by:
    connection closed
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2025-01-23T17:33:20.088253Z ERROR triagebot: run_scheduled_jobs task died (error=panic)

or

database connection error: error communicating with the server: Connection timed out (os error 110)
thread 'main' panicked at src/main.rs:377:26:
called `Result::unwrap()` on an `Err` value: database schedule jobs
Caused by:
    0: Inserting job
    1: connection closed
2025-01-23T17:37:11.484202Z ERROR triagebot: schedule_jobs task died (error=panic)

I'm uncertain what might be happening. There is code to check for closed connections, but maybe a connection is being dropped without being closed properly? It seems odd that only the scheduled jobs are affected, since I would expect stale cached connections to affect anything touching the database.
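
To illustrate the stale cached connection idea, here is a minimal sketch using only the standard library. Everything in it (`FakeConn`, `Pool`, `ping`, the `half_open` flag) is hypothetical and is not taken from the triagebot source. It assumes that a locally tracked "closed" flag cannot see a connection that the network dropped while it sat idle, so the pool does a cheap round trip before handing a cached connection out.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a database connection (not a real driver type).
#[derive(Debug)]
struct FakeConn {
    id: u32,
    /// True once the client side knows the connection is closed.
    closed: bool,
    /// Assumption for this sketch: the peer went away while the
    /// connection was idle, and the client has not noticed yet.
    half_open: bool,
}

impl FakeConn {
    /// Only reports what the client already knows.
    fn is_closed(&self) -> bool {
        self.closed
    }

    /// Simulates a cheap round trip (something like `SELECT 1`).
    fn ping(&mut self) -> Result<(), String> {
        if self.half_open {
            self.closed = true;
            Err("connection closed".to_string())
        } else {
            Ok(())
        }
    }
}

/// Hypothetical pool that caches idle connections.
struct Pool {
    idle: VecDeque<FakeConn>,
    next_id: u32,
}

impl Pool {
    /// Simulates opening a fresh connection.
    fn connect(&mut self) -> FakeConn {
        let id = self.next_id;
        self.next_id += 1;
        FakeConn { id, closed: false, half_open: false }
    }

    /// Hands out a cached connection only if it survives a ping;
    /// stale ones are discarded and a new connection is opened.
    fn get(&mut self) -> FakeConn {
        while let Some(mut conn) = self.idle.pop_front() {
            if !conn.is_closed() && conn.ping().is_ok() {
                return conn;
            }
            // `conn` is dropped here instead of being returned to the caller.
        }
        self.connect()
    }
}

fn main() {
    let mut pool = Pool { idle: VecDeque::new(), next_id: 100 };
    // Connection 1 looks open locally but the peer is gone.
    pool.idle.push_back(FakeConn { id: 1, closed: false, half_open: true });
    pool.idle.push_back(FakeConn { id: 2, closed: false, half_open: false });

    let conn = pool.get();
    println!("got connection {}", conn.id);
}
```

In this sketch, connection 1 passes the `is_closed` check and is only rejected by the ping, which is the case a closed-connection check alone would miss.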

ehuss added the bug and A-scheduled-jobs labels on Jan 23, 2025
Contributor Author

ehuss commented Feb 7, 2025

> I would expect stale cached connections to affect anything touching the database.

My new theory is that it is affecting webhooks too, but there it manifests as a hang: the webhook gets killed after 10s, and no log message is emitted when the webhook thread is killed.
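
One way to make such a hang visible would be to log at the point where the deadline fires. The sketch below uses only the standard library and is not how triagebot handles webhooks. `run_with_deadline` and the labels are hypothetical names, and the durations are shortened so the example finishes quickly.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `work` on a separate thread and waits at most `limit` for its result.
/// On timeout it writes a log line, so a hung handler leaves a trace.
/// Hypothetical helper for illustration only.
fn run_with_deadline<T, F>(label: &str, limit: Duration, work: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the deadline passed,
        // so a failed send is ignored.
        let _ = tx.send(work());
    });

    match rx.recv_timeout(limit) {
        Ok(value) => Ok(value),
        Err(mpsc::RecvTimeoutError::Timeout) => {
            let msg = format!("{} timed out after {:?}", label, limit);
            eprintln!("ERROR {}", msg);
            Err(msg)
        }
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            // The sender was dropped without sending, e.g. the worker panicked.
            let msg = format!("{} ended without a result", label);
            eprintln!("ERROR {}", msg);
            Err(msg)
        }
    }
}

fn main() {
    let fast = run_with_deadline("fast handler", Duration::from_millis(200), || 42);

    // Simulates a handler stuck on a dead database connection.
    let hung = run_with_deadline("hung handler", Duration::from_millis(50), || {
        thread::sleep(Duration::from_millis(500));
        0
    });

    println!("fast: {:?}", fast);
    println!("hung: {:?}", hung);
}
```

Note that this sketch only stops waiting. The worker thread keeps running until it finishes or the process exits, so it reports the hang but does not clean it up.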
