Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a command and benchmark flags to remove artifact data #1643

Merged
merged 4 commits into from
Sep 28, 2023

Conversation

Kobzol
Copy link
Contributor

@Kobzol Kobzol commented Jul 8, 2023

There are basically three things that we can do during benchmarking if there is already some data in the database for the benchmarked artifact:

  1. Skip the benchmark (that's what we do now)
  2. Append new data to the old
  3. Remove the old data

I don't think that appending makes much sense. If nothing has changed between two benchmark invocations with the same artifact, you just want to fill in the data. If something has changed, then you probably don't want to mix and average results for the old and the new data. And we show a minimum of measured values in the dashboard anyway, so you might as well just remove the old data.

Removing the old data is useful when rerunning benchmarks repeatedly on a local machine, so that you don't have to remove the database over and over again.

This PR adds a new --purge flag to bench_local and bench_runtime_local that removes either any old data for the given artifact, or just data for benchmarks that had an error for the given artifact. The flag is idempotent and can be used even if there is no previous data.

The PR also adds a separate collector command purge_artifact, which removes all data for a specified artifact. This can be handy e.g. for removing data that should be re-benchmarked on the perf server.

Best reviewed commit by commit.

Fixes: #804

@lqd
Copy link
Member

lqd commented Jul 10, 2023

Can this be used to re-run a benchmark that happened to fail in previous invocations, and just replace its results ? Right now, when re-running collection a failure from a previous run will be skipped of course. Or would --purge-old remove everything from the previous run, and some kind of --purge-old=errors is more of what such situations would require ?

@Kobzol
Copy link
Contributor Author

Kobzol commented Jul 10, 2023

The --purge-old flag currently removes all data for the given artifact.

Do you have a use-case for deleting just some of the benchmarks (locally)? In general (and for perf. server usage), I consider it to be safer to re-benchmark everything, because something may (and probably did) change between the two collector invocations, so re-benchmarking just a subset of benchmarks could result in this subset having a different environment.

@lqd
Copy link
Member

lqd commented Jul 10, 2023

An example where being able to overwrite data would be useful to me: sometimes I have to babysit windows benchmarks, in case weird processes start up even though they are not allowed to and deactivated everywhere (à la virus scanners) which nullify the subsequent results. When this happens only a few benchmarks are tainted and I'd want to re-run only those locally, not re-launch a full collection since that will take at least 45 minutes. No artifacts have changed between the 2 sequential collections.

@Kobzol
Copy link
Contributor Author

Kobzol commented Jul 10, 2023

No artifacts have changed between the 2 sequential collections.

Well, not, but possibly some AV started to run in the middle of some benchmark, then it's a question whether that benchmark execution was valid or not :)

In the use-case that you have described, I assume that you just kill the benchmarking, then kill AV, and then restart it? In that case it should then compute only the benchmarks that haven't been finished yet (if you don't use --purge-old). Isn't that enough for your use-case?

@lqd
Copy link
Member

lqd commented Jul 10, 2023

I'm catching the AV red-handed so I know which benchmarks are invalid and I have to remove, and which benchmark has failed when I killed collection, and all these are ignored when re-running with the same artifact id, while I want to re-run them.

@Kobzol Kobzol force-pushed the db-purge-artifact branch 4 times, most recently from 3a1ba95 to 9823a23 Compare July 17, 2023 11:45
@Kobzol
Copy link
Contributor Author

Kobzol commented Aug 6, 2023

You reviewed a bunch of my newer PRs, so I'm not sure if this one hasn't been forgotten. CC @Mark-Simulacrum

@Kobzol
Copy link
Contributor Author

Kobzol commented Sep 28, 2023

Pinging @Mark-Simulacrum to requeue this PR to avoid bitrot :)

@Kobzol Kobzol merged commit a4d589f into rust-lang:master Sep 28, 2023
9 checks passed
@Kobzol Kobzol deleted the db-purge-artifact branch September 28, 2023 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a way to overwrite benchmarks
3 participants