tidy: add support for `--extra-checks=auto:` feature #143398

lolbinarycat · 2025-07-03T19:44:59Z

in preparation for #142924

also heavily refactored the parsing of the --extra-checks argument to warn about improper usage.

now does proper parsing of git's output and falls back to assuming all files are modified if `git` doesn't work. accepts a closure so extensions can be checked.

currently this just uses a very simple extension-based heuristic.
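For illustration, here is a minimal sketch of what an extension-based heuristic could look like. The helper names `lang_for_path` and `lang_is_affected` and the extension list are assumptions made for this example, not the code in this PR.

```rust
use std::path::Path;

/// Hypothetical helper: map a changed file to the extra-check language it
/// is relevant to. The extension list is an assumption for illustration.
fn lang_for_path(path: &str) -> Option<&'static str> {
    match Path::new(path).extension()?.to_str()? {
        "py" => Some("py"),
        "sh" => Some("shell"),
        "c" | "cpp" | "h" => Some("cpp"),
        _ => None,
    }
}

/// Returns true if any changed file is relevant to `lang`.
fn lang_is_affected(changed: &[&str], lang: &str) -> bool {
    changed.iter().any(|p| lang_for_path(p) == Some(lang))
}
```

With this sketch, `lang_is_affected(&["a.rs", "b.sh"], "shell")` is true, while a change set containing only `a.rs` leaves every extra check unaffected. Files without an extension never match, which is one of the ways a purely extension-based approach can miss relevant changes.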

rustbot · 2025-07-03T19:45:06Z

This PR modifies src/bootstrap/src/core/config.

If appropriate, please update CONFIG_CHANGE_HISTORY in src/bootstrap/src/utils/change_tracker.rs.

There are changes to the tidy tool.

cc @jieyouxu

Kobzol · 2025-07-04T07:08:23Z

Thanks for working on this! To reiterate (writing this to remind myself), the use-case for this is that you want to be running some extra checks locally (which you can now configure through bootstrap.toml), but you don't want them to execute if no relevant files were modified.

I'm not fully sold on the auto: prefix idea. The syntax for extra checks starts to be quite complicated, tbh.. 😆 I don't know if anyone ever has a use-case for "only check formatting of Python files when they change, but always lint Python files", which can now be expressed as auto:py:fmt,py:lint.

What do you think about simplifying the automatic detection syntax specification by just adding an extra option called e.g. "auto" or "auto-detect"? So you would say `--extra-checks=auto-detect,py:lint,cpp:fmt`, and `auto-detect` would just enable the automatic detection for all extra checks (which is likely what people want for local usage anyway).

Btw, we already have a precedent in tidy for only formatting files that were modified locally (and this cannot really be configured). In theory, if we were able to make the fmt/lint Python/C++/Shell heuristic that detects which relevant files were changed 100% bulletproof, we could just always apply the extra checks only if any of these files were modified, without adding "auto".

That being said, I'm not sure that we can make the heuristic so solid... The problem that could occur is that if some relevant changed files are not detected locally, then tidy can be green locally, but red on CI (because CI checks everything, as we can't afford to just detect changed files there). That's an annoying situation. With your PR, people would at least have to opt into this behavior, it wouldn't be the default, unlike Rust file formatting.

Kobzol · 2025-07-04T06:45:05Z

src/tools/tidy/src/rustdoc_json.rs

-        }
+    if crate::files_modified(base_commit, |p| p == RUSTDOC_JSON_TYPES) {
+        // `rustdoc-json-types` was not modified so nothing more to check here.
+        println!("`rustdoc-json-types` was not modified.");


Shouldn't there be return; here? Also, shouldn't the condition be opposite, i.e. if !crate::files_modified?

Kobzol · 2025-07-04T06:46:33Z

src/tools/tidy/src/rustdoc_json.rs

-            eprintln!("error: failed to run `git diff` in rustdoc_json check");
-            return;
-        }
+    if crate::files_modified(base_commit, |p| p == RUSTDOC_JSON_TYPES) {


Shouldn't the == be something like starts_with or contains?
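To make the concern concrete, here is a small sketch comparing exact and prefix matching. The value of `RUSTDOC_JSON_TYPES` below is an assumption for illustration, and the helper names `matches_exact` and `matches_prefix` are hypothetical.

```rust
use std::path::Path;

/// Assumed value for illustration; the real constant lives in tidy.
const RUSTDOC_JSON_TYPES: &str = "src/rustdoc-json-types";

/// Exact comparison: only matches the directory path itself, never a file in it.
fn matches_exact(path: &str) -> bool {
    path == RUSTDOC_JSON_TYPES
}

/// Component-wise prefix comparison: matches any file under the directory.
fn matches_prefix(path: &str) -> bool {
    Path::new(path).starts_with(RUSTDOC_JSON_TYPES)
}
```

Since `git diff` reports file paths rather than directories, `matches_exact("src/rustdoc-json-types/lib.rs")` is false while `matches_prefix` accepts it. `Path::starts_with` compares whole path components, so a sibling directory such as `src/rustdoc-json-types-old` would not match, which a plain `str::starts_with` would let through.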

Kobzol · 2025-07-04T06:49:54Z

src/tools/tidy/src/lib.rs

+                    .expect("bad format from `git diff --name-status`");
+                if status == "M" { Some(name) } else { None }
+            });
+            for modified_file in modified_files {


modified_files.any(pred)
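A rough sketch of how the loop could collapse into a single `any` call. The function `any_modified` is hypothetical and takes the `git diff --name-status` style output as a string, so it does not invoke `git`; unlike the diff above, it skips malformed lines instead of panicking.

```rust
/// Returns true if any modified ("M") file listed in `name_status`
/// satisfies `pred`. Each line is expected to look like "<status>\t<path>".
fn any_modified(name_status: &str, pred: impl Fn(&str) -> bool) -> bool {
    name_status
        .lines()
        .filter_map(|line| {
            // Lines without a tab separator are ignored in this sketch.
            let (status, name) = line.split_once('\t')?;
            if status == "M" { Some(name) } else { None }
        })
        .any(pred)
}
```

Note that, like the snippet under review, this only considers entries with status `M`, so added or renamed files are not counted.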

Kobzol · 2025-07-04T06:51:07Z

src/tools/tidy/src/lib.rs

+/// Returns true if any modified file matches the predicate, if we are in CI, or if unable to list modified files.
+pub fn files_modified(ci_info: &CiInfo, pred: impl Fn(&str) -> bool) -> bool {
+    let Some(base_commit) = &ci_info.base_commit else {
+        eprintln!("No base commit, assuming all files are modified");


Could we panic here if we're on CI? This would be a serious issue if the commit was missing for some reason, we shouldn't just skip it.

In fact, let's just return true from files_modified if we're on CI. We should always check everything on CI.
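Roughly, the suggested behavior could look like the sketch below. The `CiInfo` struct here is a simplified, hypothetical stand-in (the `is_ci` field in particular is an assumption), and the `changed` slice replaces the file list that would normally come from `git`.

```rust
/// Hypothetical, simplified stand-in for tidy's `CiInfo`.
struct CiInfo {
    is_ci: bool,
    base_commit: Option<String>,
}

/// `changed` stands in for the list of modified files obtained from `git`.
fn files_modified(ci_info: &CiInfo, changed: &[&str], pred: impl Fn(&str) -> bool) -> bool {
    // On CI, always check everything.
    if ci_info.is_ci {
        return true;
    }
    // Without a base commit nothing can be diffed, so be conservative.
    if ci_info.base_commit.is_none() {
        eprintln!("No base commit, assuming all files are modified");
        return true;
    }
    changed.iter().copied().any(pred)
}
```

The CI check comes first so that a missing base commit on CI can never cause a check to be skipped.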

lolbinarycat · 2025-07-04T08:33:23Z

What do you think about simplifying the automatic detection syntax specification by just adding an extra option called e.g. "auto" or "auto-detect"?

that was the original idea, but we can't do that because shellcheck is not run in CI, and I don't wanna have auto just exclude shellcheck because reasons. Also if someone has only some of the linters it's nice to give them the option i suppose.

The parsing refactor is complicated, but it was particularly needed to provide actual input validation, and the auto: thing was the simplest thing that dealt with the shellcheck inconsistency without feeling like a weird hack that is completely unintuitive.

Kobzol · 2025-07-04T08:42:03Z

that was the original idea, but we can't do that because shellcheck is not run in CI, and I don't wanna have auto just exclude shellcheck because reasons. Also if someone has only some of the linters it's nice to give them the option i suppose.

I'm not sure if I understand, this should have nothing to do with CI or shellcheck. auto would never get used on CI, only locally, and if enabled, it would essentially behave exactly as if you automatically applied the auto: prefix to all passed extra checks. Nothing less, nothing more.

I'm just proposing to turn --extra-checks=auto:py:fmt,auto:py:lint,cpp into --extra-checks=auto,py:fmt,py:lint,cpp.
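To compare the two syntaxes side by side, here is a minimal sketch. The `Check` struct and both parser functions are hypothetical and do no validation; they only show how each form would attach the auto flag.

```rust
/// Hypothetical parsed form of one `--extra-checks` item.
#[derive(Debug, PartialEq)]
struct Check {
    auto: bool,
    spec: String, // e.g. "py:fmt" or "cpp"
}

/// Per-item prefix syntax: `auto:py:fmt,auto:py:lint,cpp`.
fn parse_prefixed(arg: &str) -> Vec<Check> {
    arg.split(',')
        .map(|item| match item.strip_prefix("auto:") {
            Some(rest) => Check { auto: true, spec: rest.to_string() },
            None => Check { auto: false, spec: item.to_string() },
        })
        .collect()
}

/// Blanket syntax: a standalone `auto` item applies to every other item.
fn parse_blanket(arg: &str) -> Vec<Check> {
    let auto = arg.split(',').any(|item| item == "auto");
    arg.split(',')
        .filter(|item| *item != "auto")
        .map(|item| Check { auto, spec: item.to_string() })
        .collect()
}
```

Under these assumptions, `parse_blanket("auto,py:fmt,py:lint,cpp")` yields the same result as `parse_prefixed("auto:py:fmt,auto:py:lint,auto:cpp")`, so the blanket form also marks `cpp` as auto. A mixed request such as `auto:py:fmt,py:lint` can only be written with the prefix form.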

lolbinarycat · 2025-07-04T09:57:07Z

I'm not sure if I understand, this should have nothing to do with CI or shellcheck.

the reason why it matters is that if someone sets a blanket auto and then modifies a single shell script, tidy will suddenly start spewing hundreds of shellcheck warnings. this seems undesirable.

saying "auto just ignores shell" feels like a footgun and is also very inelegant.

Kobzol · 2025-07-04T10:10:30Z

Auto would only modify the detection logic. If you don't specify shell in the extra checks, shellcheck won't be run. Saying just --extra-checks=auto wouldn't do anything.

GuillaumeGomez · 2025-07-04T12:23:22Z

src/tools/tidy/src/ext_tool_checks.rs

+        let Some(mut first) = parts.next() else {
+            return Err(ExtraCheckParseError::Empty);
+        };
+        if first == "auto" {


Could we have an all keyword too? Could be convenient for CI to ensure we didn't miss any.

While all would be a good idea, we don't currently run shellcheck on CI, so it wouldn't be used there.

A reason why we don't? Shells are currently a horror show?

I'll try to ignore my OCD and to NOT take a look.

the reason is because previous attempts at adding shellcheck introduced a bunch of subtle bugs, there's a t-infra zulip thread where i asked about this.

lolbinarycat · 2025-07-05T06:40:56Z

Auto would only modify the detection logic. If you don't specify shell in the extra checks, shellcheck won't be run. Saying just --extra-checks=auto wouldn't do anything.

I don't think that's actually simpler to implement tbh, and I think the argument against it exists in your own post: it adds new trivial cases that could be mistaken as meaning something else. it also seems less intuitive and harder to explain in documentation: how it looks like other items but behaves differently, how it means nothing on its own, etc...

it could maybe be simpler to implement if i completely rewrote my implementation from scratch, but i think in doing so i would need to give up some error reporting, so it would be a significant amount of work to redo things to get something slightly less robust.

Kobzol · 2025-07-05T07:34:26Z

I didn't necessarily mean simpler to implement, but having a simpler mental model for the functionality. Now it's in a sense too powerful, as I said before, likely no one will ever need auto:py:lint,py:fmt. Of course using --extra-checks=auto is indeed a hack, it should be a separate flag like --only-check-changed or something.

But I don't want to block this further. Please wait until #143452 is merged (perhaps in #143473), then rebase and r=me.

lolbinarycat · 2025-07-05T15:10:36Z

Unfortunately #134006 does something kinda odd with the extra check args that is incompatible with that model, so I think I need to rewrite that to use --bless instead?

I don't think there's any way to avoid having that be a breaking change, though.

lolbinarycat · 2025-07-05T15:11:49Z

I think I'm gonna submit a separate PR changing how that works, as that should really be done first to avoid messy issues.

lolbinarycat added 5 commits July 3, 2025 13:21

tidy: refactor --extra-checks parsing

8c32e87

tidy: factor out change detection logic and make it more robust

512cab0

now does proper parsing of git's output and falls back to assuming all files are modified if `git` doesn't work. accepts a closure so extensions can be checked.

tidy: update files_modified to take CiInfo

2d0f2ab

tidy: add auto: prefix to --extra-checks syntax

9887e63

currently this just uses a very simple extension-based heuristic.

tidy: warn when --extra-checks is passed an invalid lang:kind combo

ebdd5d3

rustbot assigned Kobzol Jul 3, 2025

rustbot added labels A-tidy (Area: The tidy tool), S-waiting-on-review (Status: Awaiting review from the assignee but also interested parties), T-bootstrap (Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)) on Jul 3, 2025

Kobzol reviewed Jul 4, 2025

GuillaumeGomez reviewed Jul 4, 2025

lolbinarycat mentioned this pull request Jul 5, 2025

tidy: use --bless for tidy spellcheck instead of spellcheck:fix #143493

Open
