Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inoculate test run timeout is bad and i should feel bad about it #512

Open
hawkw opened this issue Jan 11, 2025 · 0 comments
Open

inoculate test run timeout is bad and i should feel bad about it #512

hawkw opened this issue Jan 11, 2025 · 0 comments
Assignees
Labels
area/devtools Related to development tools and build system, such as inoculate. kind/enhancement New feature or request

Comments

@hawkw
Copy link
Owner

hawkw commented Jan 11, 2025

as I said in #511 (comment)

We should probably make the timeout much smarter than just "kill the
QEMU process after 20 minutes", and instead do it only if we don't see
any interesting test output, or something. But, this is better than just
always failing test runs because we're stupid about timeouts, I guess.

basically, cargo inoculate test has a --timeout-secs flag, which fails the test run if tests don't complete within a specified timeout. this is intended to fail the run due to issues like the kernel deadlocking or tests not running at all. however, the way we do this currently is Real Dumb and Bad: we spawn the QEMU process with spawn_timeout, and abruptly kill it if it hassn't already exited after the timeout. this means that we might still be legitimately running tests, and as we add more tests, the likelihood of that happening increases.

instead, we probably should do something where we reset the timeout every time a test completes successfully, so we only kill the VM if one test has been wedged for a really long time, rather than bounding the total length of test suite execution.

also, we probably should tell the user that the tests timed out, instead of just killing the VM and then getting confused because QEMU exited abruptly. right now, the dev experience is pretty bad: we kill the VM and then the test runner prints a bunch of stuff about how it was really surprised that QEMU just randomly exited for no reason. my brother in Christ, you were the one that killed him! this incorrectly makes the user think that the kernel has done something so evil and wrong that made the VM crash, instead of "oh your tests took slightly too long to run and we timed out".

this is probably best done by making the inoculate test runner bits async and using tokio::process::Command rather than std::process::Command, so that we can select over a test starting/finishing or a timeout firing, and stuff.

@hawkw hawkw added kind/enhancement New feature or request area/devtools Related to development tools and build system, such as inoculate. labels Jan 11, 2025
@hawkw hawkw self-assigned this Jan 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/devtools Related to development tools and build system, such as inoculate. kind/enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant