Skip to content

WIP: New lint: needless_path_new #14895

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 45 commits into
base: master
Choose a base branch
from

Conversation

ada4a
Copy link
Contributor

@ada4a ada4a commented May 25, 2025

fs::read_to_string(Path::new("foo.txt")) -> fs::read_to_string("foo.txt")

The commit history of this might be a bit messy -- it took me some time to figure this lint out. I cleaned it up a bit, but it's still not perfect

Fixes #14668

changelog: [needless_path_new]: new lint

WIP because:

  • it was suggested on Zulip to look into using trait obligations for impl AsRef<Path>. Later comments seem to not quite agree? As I said on Zulip, I don't know what these obligations are all about, so let me know if you think I should look into them (I'd appreciate some help as well)
  • I know that all the descriptions for this lint are pretty verbose -- I think its purpose is just hard to describe succintly. That would've been easier originally, since I thought the lint would apply only to strings enclosed in Path::new, but that was pretty easy to expand to "any impl AsRef<Path>", so I did that.

@rustbot
Copy link
Collaborator

rustbot commented May 25, 2025

r? @blyxyas

rustbot has assigned @blyxyas.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties label May 25, 2025
Comment on lines 76 to 77
name: &str,
fn_kind: &str,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two arguments are what makes the CI fail currently, since they are unused. I got them when copy-pasting from mut_reference, and I haven't removed them (yet) because I thought I could use them to give a better warning message. Something like:

format!("the {fn_kind} {name} accepts `AsRef<Path>`, \
and {x} implements that trait, so wrapping it in `Path::new` is unnecessary")

If you think it makes sense to have the message, I'd add it in -- otherwise I'd just remove these arguments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constant warnings got annoying, so I removed the arguments for now...


fn check_arguments<'tcx>(
cx: &LateContext<'tcx>,
arguments: &mut dyn Iterator<Item = &'tcx Expr<'tcx>>,
Copy link
Contributor Author

@ada4a ada4a May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have this function signature from mut_reference -- should I maybe replace the dyn with impl here?

@samueltardieu
Copy link
Contributor

samueltardieu commented May 27, 2025

I think you need to check that you can use a AsRef<Path>, otherwise you might suggest removing Path::new() while you really need a &Path.

Don't forget to look at the lintcheck logs for your PR. You'll see the first hit is probably wrong, as it requires building a Option<&Path>, not a Option<dyn AsRef<Path>>.

@ada4a
Copy link
Contributor Author

ada4a commented May 27, 2025

Don't forget to look at the lintcheck logs for your PR.

oh, didn't know about that! thank you, that's very helpful

I think you need to check that you can use a AsRef<Path>, otherwise you might suggest removing Path::new() while you really need a &Path.

You'll see the first hit [in lintcheck] is probably wrong, as it requires building a Option<&Path>, not a Option<dyn AsRef<Path>>.

I thought I already covered that with the following line:

&& implements_asref_path(*parameter)

I even have a test case for exactly this!

// no warning
fn takes_path(_: &Path) {}
takes_path(Path::new("foo"));

But now that I think about it, the check above doesn't actually work?... It just checks that whatever type parameter has itself implements AsRef<Path>, and that is the case for Path. I wonder why the test even works in that case... Maybe something to do with Path vs &Path?

I guess what I need to check instead is that parameter is impl AsRef<Path>. is_type_diagnostic_item seems to be closest to what I want, but it wouldn't include the impl part? Btw that makes me wonder if I should check for dyn AsRef<Path> as well

@ada4a
Copy link
Contributor Author

ada4a commented May 27, 2025

But the weirdest thing is that Some(..) is not even a function/method call?... The lint is supposed to only check expressions that are ExprKind::Call or ExprKind::MethodCall -- but maybe the former actually includes anything of the form foo(bar)?...

EDIT: indeed it does. but I guess this is not actually the root cause of the problem, so we could keep it? Maybe the function/method restriction should not even be necessary, maybe we should just check that the place the expression is in requires an impl AsRef<Path>, be that place a parameter in a function call, or e.g. a RHS of an assignment, like

let _: Option<impl AsRef<Path>>: Some(Path::new("foo.txt"));

(assuming impl Trait in this position would be valid, but this is just an example)

let implements_asref_path = |arg| implements_trait(cx, arg, asref_def_id, &[path_ty.into()]);

if let ty::FnDef(..) | ty::FnPtr(..) = type_definition.kind() {
let parameters = type_definition.fn_sig(tcx).skip_binder().inputs();
Copy link
Contributor Author

@ada4a ada4a May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this whole thing a bit more.

As I said towards the end of this comment, what we want to check is whether a particular parameter of a function is impl AsRef<Path>, i.e. find a p such that fn foo(p: impl AsRef<Path>).

But the latter is just syntax sugar for fn foo<P: AsRef<Path>>(p: P), so we are in fact looking for a parameter whose type is defined in a generic type parameter of the function.

Therefore I think we shouldn't actually calling skip_binder here. Without that, I get the following chain:

  1. The type of type_definition.fn_sig(tcx) is PolyFnSig, which is an alias to Binder<'tcx, FnSig<'tcx>>
  2. that one has a Binder::inputs method, which returns a Binder<'tcx, &'tcx [Ty<'tcx>]>
  3. I can kind of iterate over that using Binder::iter, which finally gives me impl Iterator<Item=Binder<'tcx, Ty<'tcx>>>
  4. and now I'm stuck. What I think I have achieved at this point is turn a function like fn foo<P, T>(p: P, t: T, n: u32) into something like [ p<P,T>: P, t<P, T>: T, n<P, T>: u32 ] (syntax very much pseudocode-ish). So from there I need to first "shed" the unnecessary generics, and then?... No idea to be honest

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, this slipped through, indeed Zulip would be a better choice for those kind of discussion. Have you read the documentation on ParamEnv? You could get the caller bounds that the caller has to adhere to for the parameter index you're interested in, and check whether the type of the argument would meet those bounds (so that for example a bound of impl AsRef<Path> + Clone would not receive an object which implements AsRef<Path> but is not Cloneable).

Also, you will have to check that the type of the parameter you're interested in does not appear in another parameter or a clause applying to another parameter type, as for example

fn foo<P: AsRef<Path>>(x: P, y: P) {}

must use the same type for both parameters, and you better let this untouched.

Does that help?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, the whole Analysis section from the dev-guide seems to be exactly what I need! Thank you very much for this and all the other pointers, I'll read them and see what I come up with –and ask on Zulip if I don't understand something

@rustbot

This comment has been minimized.

@ada4a ada4a force-pushed the needless_path_new branch from a0b0291 to d6f2bc2 Compare June 2, 2025 12:05
@ada4a
Copy link
Contributor Author

ada4a commented Jun 2, 2025

@samueltardieu just pinging in case the notifications slipped through. If you can't/don't want to deal with this, I'll probably ask the Zulip folks for help again?

@rustbot

This comment has been minimized.

@ada4a ada4a force-pushed the needless_path_new branch from dfe7581 to c91bf12 Compare June 5, 2025 19:59
@ada4a ada4a force-pushed the needless_path_new branch from c91bf12 to 8915aff Compare June 20, 2025 23:13
@blyxyas
Copy link
Member

blyxyas commented Jun 24, 2025

Currently making the feature freeze feature and the performance project perform, I don't have the capacity right now to review this :/

r? clippy

@rustbot rustbot assigned Centri3 and unassigned blyxyas Jun 24, 2025
@ada4a
Copy link
Contributor Author

ada4a commented Jun 24, 2025

Oh hi! All good, I'm still very much not done with this 😅 We're slowly making progress over on Zulip: https://rust-lang.zulipchat.com/#narrow/channel/257328-clippy/topic/needless_path_new/with/525402801

@blyxyas
Copy link
Member

blyxyas commented Jun 24, 2025

I'm still very much not done with this

No problem, when you consider that you're in a ready-to-review state, you can mention the reviewer assigned to take a look

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status) and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties labels Jun 24, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jun 24, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

Copy link
Contributor

@Jarcho Jarcho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r? Jarcho

I can take over the review for this.

Comment on lines 43 to 52
ExprKind::Call(fn_expr, args) => {
check_arguments(cx, &mut args.iter(), cx.typeck_results().expr_ty(fn_expr));
},
ExprKind::MethodCall(_, receiver, arguments, _)
if let Some(def_id) = cx.typeck_results().type_dependent_def_id(e.hir_id) =>
{
let args = cx.typeck_results().node_args(e.hir_id);
let method_type = cx.tcx.type_of(def_id).instantiate(cx.tcx, args);
check_arguments(cx, &mut iter::once(receiver).chain(arguments.iter()), method_type);
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a strange way to get the signatures lined up with the parameters. What you want to do is something like:

let (fn_did, args) =  match e.kind {
    ExprKind::Call(callee, args)
        if let Res::Def(DefKind::Fn | DefKind::AssocFn, did) = path_res(cx, callee)
    => {
        (did, args)
    }
    ExprKind::MethodCall(_,_, args, _)
        if let Some(did) = cx.typeck_results().type_dependent_def_id(e.hir_id)
    => {
        (did, args)
    }
    _ => return,
};

let sig = cx.tcx.fn_sig(fn_did).skip_binder().skip_binder();
for (i, (arg_ty, arg)) in sig.inputs().iter().rev().zip(args.iter().rev()).enumerate() {
    // if you need the actual index
    let i = sig.inputs().len() - i - 1;
    // if you need the other tys in the sig
    let other_tys = sig.inputs_and_output[..i].iter().chain(sig.inputs_and_output[..i + 1].iter());
}

Note you can't actually lint on the method receiver in the first place since that would affect method lookup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of the weirdness has to do with the fact that I started this lint by copy-pasting mut_reference.rs -- I know realize that what I want to do is fairly different, but still, it was easier than starting from scratch 😅

Am I understanding correctly that the last for-loop is supposed to replace the one I currently have in check_arguments? And if so, why exactly would I want to rev here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and that double .skip_binder() looks a bit.. shady

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let other_tys = sig.inputs_and_output[..i].iter().chain(sig.inputs_and_output[..i + 1].iter());

You probably meant this?

let other_tys = sig.inputs_and_output[..i].iter().chain(sig.inputs_and_output[(i + 1)..].iter());

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding correctly that the last for-loop is supposed to replace the one I currently have in check_arguments? And if so, why exactly would I want to rev here?

rev is to handle methods and functions on the same code path. Method calls don't have the receiver in the argument list so rev is used to line up the arguments with their respective types.

Oh and that double .skip_binder() looks a bit.. shady

This is the exact use case for skip_binder. We want to work on the function signature as written, not an instantiation of it.

You probably meant this?

Yes. That was a typo on my part.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to use for_each instead of a for loop here since it will give the optimized version of the chained iterator loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if let Res::Def(DefKind::Fn | DefKind::AssocFn, did) = path_res(cx, callee)

Previously, we had constructors of tuple structs or enum variants (it was Option::Some in particular) also slipping through and getting analyzed, which I wasn't expecting initially, but after some discussion we decided it wouldn't hurt.

Because of that, I'd be keen to also include DefKind::Ctor, and in particular DefKind::Ctor(_, CtorKind::Fn), because CtorKind::Const doesn't really fit

Comment on lines 78 to 87
let path_ty = {
let Some(path_def_id) = tcx.get_diagnostic_item(sym::Path) else {
return;
};
Ty::new_adt(tcx, tcx.adt_def(path_def_id), List::empty())
};

let Some(asref_def_id) = tcx.get_diagnostic_item(sym::AsRef) else {
return;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can all be done once when constructing the lint pass. The constructors for late lint passes have access to TyCtxt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, I forgot that I can have a constructor for the lint pass!

But I'm afraid it won't quite work? I looked at a few other lints whose structs had constructors that used TyCtxt, and none of them are fallible. And since my constructor would be, I think I wouldn't be able to call it inside register_late_pass here?

store.register_late_pass(|_| Box::new(needless_path_new::NeedlessPathNew));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, most lints don't bother with this at all, and effectively just inline implements_asref_path into the if-let chain:

if let ExprKind::Call(func, [arg]) = argument.kind
    && is_path_new(func)
    && let Some(asref_def_id) = tcx.get_diagnostic_item(sym::AsRef)
    && let Some(path_def_id) = tcx.get_diagnostic_item(sym::Path)
    && let path_ty = Ty::new_adt(tcx, tcx.adt_def(path_def_id), List::empty())
    && implements_trait(cx, cx.typeck_results().expr_ty(arg), asref_def_id, &[path_ty.into()])
    && implements_trait(cx, *parameter, asref_def_id, &[path_ty.into()])
{
    /* fire lint */
}

The only reason I didn't want to do that was because implements_asref_path is called in a loop in our case, so I wanted to extract avoid recomputing path_ty and asref_def_id every time. But it was said to me on Zulip that things like TyCtxt::get_diagnostic_item are fairly cheap, so it might be not worth the bother after all...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just store the Option<DefId> in the lint pass, no need to have a fallible constructor.

All the other lint passes really should be doing it once up front.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean we'd still need to have the if let Some(asref_did) = self.asref_did... which is of course faster than TyCtxt::get_diagnostic_item, but still a bit annoying. Maybe fallible constructors aren't that bad of an idea after all...

But I'll go on and do the Option<DefId> thing for now

Copy link
Contributor Author

@ada4a ada4a Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 599b3d7 (I'll let you judge whether this is resolved)

the new NeedlessPathNew needed a 'tcx lifetime, and declare_lint_pass! didn't accept lifetimes, so I ended up switching to impl_lint_pass! instead

Comment on lines 100 to 109
let generic_args_we_can_change: Vec<_> = generic_args
.iter()
.filter_map(|g| g.as_type())
// if a generic is used in multiple places, we should better not touch it,
// since we'd need to suggest changing both parameters that using it at once,
// which might not be possible
.filter(|g| {
let inputs_and_output = fn_sig.inputs().iter().copied().chain([fn_sig.output()]);
inputs_and_output.filter(|i| i.contains(*g)).count() < 2
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better determined once we find a call to Path::new lines up with a generic argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this was just a WIP thing to see if it'll even work (and have something to link to for the folks on Zulip)

Comment on lines +125 to +126
&& implements_asref_path(cx.typeck_results().expr_ty(arg))
&& implements_asref_path(*parameter)
Copy link
Contributor

@Jarcho Jarcho Jun 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do the correct thing. What you want to check for is that the calling function has:

  • A generic parameter in the argument location
  • That generic has an AsRef bound and maybe a Sized bound.
  • No other bounds are applied to the generic
  • The generic isn't used as an equality bound
  • The generic doesn't appear anywhere else in the function's signature.

As a rough idea of what you want to do:

for (arg_ty, arg) in _ {
  if let ty::Param(arg_ty) = arg_ty.kind()
    && is_path_new(arg)
    && !is_used_elsewhere_sig(arg_ty.index, /* other sig tys */)
    && has_required_preds(arg_ty.index, cx.tcx.predicates_of(arg_ty))
  {}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A generic parameter in the argument location

What exactly do you mean by "generic parameter"? Is it GenericArg, a generic of a function, in this case a type parameter? Or a function parameter (Ty) which contains a generic (which is in turn defined in the function signature)?

What I tried doing is taking the former (stored as [GenericArg] in ExprKind::FnDef), and seeing if any of the parameters Ty::contains them -- but that didn't quite work out, see this message and the ones after that.

Also not sure what you meant by arg_ty.index -- probably the i from the for-loop shown in #14895 (comment)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly do you mean by "generic parameter"?

The argument has a generic parameter as it's entire type.

Also not sure what you meant by arg_ty.index

index is a field on ParamTy. Note the original arg_ty is a middle Ty you get from the function's signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait so you get the original arg_tys from tcx.fn_sig().inputs(), which, as I understand it, returns the input parameters of the function. So I thought that, if you had a function fn foo<T, U>(maybe_t: Option<T>, num: u32, u: U);, it would return [Option<T>, u32, U]. But what you're saying is that we would actually get [T, U, Option<T>, u32, U], and because of that would be able to then filter only the types that are purely ParamTys, so the first two (and I guess the last one?)? Because otherwise I don't understand how you would get a ParamTy from a Ty of a parameter, if the latter can:

  • contain a type parameter generic (Option<T>)
  • contain nothing but a type parameter generic (U)
  • contain no type parameter generic at all (u32)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. You are correct that the inputs in the function signature would be [Option<T>, u32, U]. The only parameter you can reasonably lint on is the third one so you filter to only work on those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, thank you!

Coming back to this one:

The generic doesn't appear anywhere else in the function's signature.

I'm still not sure how exactly I'd achieve that?..

You code shows this:

!is_used_elsewhere_sig(arg_ty.index, /* other sig tys */)

so apparently ParamTy.indexes are involved somehow, but it's my first time hearing about them, so I don't really know where/how I would use them? The documentation for both the struct and the field is... not comprehensive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The section on ty of the dev-guide looks relevant, reading it rn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading it, an approach that comes to mind is:
0. given a ParamTy.index and a list of parameters to be checked for not using the type parameter

  1. for each parameter:
  2. match on its TyKind
  3. if it's something that can have GenericArgs (Adt and FnDef AFAICT),
  4. check that GenericArgs doesn't contain a GenericArg with an index equal to TyParam.index

This seems fairly straightforward, but maybe there are some helper methods that could help shorten it even further?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's something that can have GenericArgs (Adt and FnDef AFAICT)

Ok so there are, in fact, more TyKinds that can have various kinds of generic args. I've been trying to match them all, and here's what it amounts to already:

sig.inputs_and_output.iter().flat_map(|param| match param.kind() {
    // `RawList<GenericArg>`
    ty::Adt(_, generic_args)
    | ty::FnDef(_, generic_args)
    | ty::Closure(_, generic_args)
    | ty::CoroutineClosure(_, generic_args)
    | ty::Coroutine(_, generic_args)
    | ty::CoroutineWitness(_, generic_args) => generic_args,
    ty::Alias(_, AliasTy { args }) => args,
    // `RawList<Ty>`
    ty::Tuple(tys) => tys,
    // `Ty`
    ty::Param(param_ty) => param_ty,
    _ => [],
})

I once again feel like I'm doing something very wrong here... I'll wait for your response before continuing with this

@rustbot rustbot assigned Jarcho and unassigned Centri3 Jun 25, 2025
ada4a added 27 commits June 26, 2025 22:37
we call `implements_asref_path` twice on each iteration of the loop, so
all the repeated creation could get expensive
it was recommended on Zulip to keep the check, so do that and give the
motivation for having it
apparently a common thing to do? definitely reduces verbosity
can revert this later if necessary
"that would affect method lookup"
to reduce the diff in the next commit
@ada4a ada4a force-pushed the needless_path_new branch from c18844d to 1f63044 Compare June 26, 2025 20:38
@ada4a ada4a force-pushed the needless_path_new branch from 99a3e7b to 599b3d7 Compare June 27, 2025 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action from the author. (Use `@rustbot ready` to update this status)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New lint: needless Path::new
6 participants