Implement a lint for implicit autoref of raw pointer dereference #103735
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
b1f35e2
to
493b5fd
Compare
Haven't looked at the lint implementation yet, but the std changes seem great. :) And yeah I think this can't be more than warn-by-default as a start. |
Oh, nice, the compiler has places which trigger this lint too! |
r? @RalfJung |
Co-authored-by: klensy <[email protected]> Co-authored-by: Ralf Jung <[email protected]>
2371483
to
459d6ad
Compare
We discussed this in the lang meeting today, and agreed that the current lint is far more broad than we'd be willing to accept, especially as deny-by-default. We're interested in seeing less impactful versions, like those mentioned above.
We thought the addr_of_mut!((*ptr)[..layout_size]) example was particularly persuasive, much more so than the same expression outside that context. There were some vague ideas around maybe checking against a goal that the Stacked Borrows state not be modified. But there was also a wish for a translation of our vague mental models into a nice specific writeup from the OpSem team. (@rust-lang/lang, please add more if I forgot or mischaracterized something. There was lots of discussion today.)
Personally, taking a quick look at the libs changes, stuff like this really doesn't seem like an improvement to me:
- assert!((*tail).value.is_none());
+ assert!((&(*tail).value).is_none());
Though admittedly that's in part due to me knowing what those methods do. I'm not sure if there's some form of that intuition that could be formalized, like noting that they don't return borrows and thus the extra thing atop the borrow stack isn't a problem or something? (Or it's also possible that I'm wrong and those are a problem that really ought to be |
Indeed I think this lint should be warn-by-default anyway.
Yeah, the risky cases are
|
In current stacked borrows (which I presume you're alluding to here) what you're asking about doesn't exist. All reborrows change state. Reborrows of the topmost tag tend to be inconsequential, but detecting that code is working with the topmost tag at compile time sounds intractable. Perhaps there are a few very specific cases we could exclude from the lint, but at a glance I don't see them in the code above. (at one point I was very interested in certain code patterns that create inconsequential tags, because I hoped dealing with those could be an alternative to adding a GC to Miri)
I don't know if Ralf chose not to mention this because he intends to fix this in a new aliasing model, but here is an example of this code causing UB (adapted based on https://crates.io/crates/lathe/0.0.0):
fn main() {
    let mut b = Box::new(Some(0usize));
    let raw = Box::into_raw(b);
    unsafe {
        let r = &*raw;
        let ptr = raw as *const Option<usize>;
        let new_box = Box::from_raw(ptr as *mut usize);
        let z = (*ptr).is_none();
        drop(new_box); // Call drop explicitly to make the error simpler
    }
}
|
For operations that take an &T and actually read that entire T, it makes no difference whether we use a reference or a raw ptr (same for &mut T and writing the entire T). That's why e.g. MaybeUninit::read/write probably should not get the lint.
Now, is_none doesn't read the entire T, so there are cases where a raw ptr version of it might make sense, but that seems like a niche case I would not focus on for a lint.
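To make the distinction in the comment above concrete, here is a minimal sketch. The helper names (`read_whole`, `is_none_raw`, `is_none_via_ref`) are hypothetical and exist neither in std nor in this PR. `read_whole` accesses the entire `T`, so going through a reference or a raw pointer touches the same memory. `is_none_raw` inspects the place `*ptr` directly in a `match`, which creates no intermediate `&Option<T>`. `is_none_via_ref` writes out the reference that `(*ptr).is_none()` would otherwise create implicitly.

```rust
/// Reads the whole `T` behind `ptr`; the entire value is accessed either way.
///
/// # Safety
/// `ptr` must be valid for reads, aligned, and point to an initialized `T`.
pub unsafe fn read_whole<T: Copy>(ptr: *const T) -> T {
    unsafe { ptr.read() }
}

/// Hypothetical raw-pointer counterpart of `Option::is_none`.
/// Matching on the place `*ptr` with a non-binding pattern creates no reference.
///
/// # Safety
/// `ptr` must be valid for reads, aligned, and point to an initialized value.
pub unsafe fn is_none_raw<T>(ptr: *const Option<T>) -> bool {
    unsafe { matches!(*ptr, None) }
}

/// Same check, but with the reference spelled out explicitly.
///
/// # Safety
/// In addition to the requirements above, creating `&*ptr` must be sound:
/// the usual aliasing rules for shared references apply to the whole `Option<T>`.
pub unsafe fn is_none_via_ref<T>(ptr: *const Option<T>) -> bool {
    unsafe { (&*ptr).is_none() }
}
```

Both `is_none_*` helpers return the same answer for valid input; they differ only in whether a reference to the whole `Option<T>` comes into existence along the way, which is the point the lint discussion revolves around.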
|
T-lang briefly discussed this today. We felt that for a proper discussion we want a summary that outlines:
#103735 (comment) laid out two cases, but it's not quite clear how they generalize to me. That comment defines the lint to be:
Is that entirely accurate? You mention (#103735 (comment)) that the risky cases are:
Reflecting a little, I think that part of the difficulty in the prior discussion was that I think the thing that would help most with this is focusing on the 3rd point above - what is the expected delta in user code after the lint? And perhaps some discussion of why a rule of |
I agree with comments above that the alternative code that is suggested is often not better. If we had postfix
- unsafe { addr_of_mut!((*ptr)[..layout_size]) }
+ unsafe { addr_of_mut!((&mut *ptr)[..layout_size]) }
something like this definitely is
+ #[allow(implicit_unsafe_autorefs)]
  unsafe { addr_of_mut!((*ptr)[..layout_size]) }
I also think we can do much better in targeting this lint only at cases that matter the most. Specifically, I'd like to suggest the following algorithm for determining when this should fire: We look for the following sequence:
The rationale for this is quite simple: a. In the case of place projections, we want to warn users because they created a
I think this covers all of the important cases. These two examples from Ralf are covered:
use std::ptr::addr_of_mut;
pub fn test(ptr: *mut [u8]) -> *mut [u8] {
    let layout_size = 24;
    unsafe { addr_of_mut!((*ptr)[..layout_size]) }
}
pub struct Test {
    data: [u8],
}
pub fn test_len(t: *const Test) -> usize {
    unsafe { (*t).data.len() }
}
Most of the other cases that this fires on (within this PR) are not covered though, and I think that's probably a good thing. I'd like to explicitly call out the
fn main() {
    let mut b = Box::new(Some(0usize));
    let raw = Box::into_raw(b);
    unsafe {
        let r = &*raw;
        let ptr = raw as *const Option<usize>;
        let new_box = Box::from_raw(ptr as *mut usize);
        let z = match *ptr {
            Some(_) => false,
            None => true,
        };
        drop(new_box); // Call drop explicitly to make the error simpler
    }
}
But that's still UB. The main point here is that the autoref part of the
The
Speaking a little more philosophically, I think the concern from T-lang was actually exactly right: There is very little that you can think you are doing with a
This line of thought even yields another category of methods that we should apply the attribute to: Anything that looks like
fn as_ptr(&self) -> *const Self {
    self
}
Here too, the user is likely to expect that they are only accessing the "pointer value" of
List of methods in std that I know about that should get the annotation (will expand as I think of more):
Edit: The list was originally missing item c (the
use core::ptr::addr_of;
use core::ops::Deref;
fn main() {
    unsafe {
        struct W<T>(T);
        impl<T> Deref for W<T> {
            type Target = T;
            fn deref(&self) -> &T { &self.0 }
        }
        let w: W<i32> = W(5);
        let w = addr_of!(w);
        let p: *const i32 = addr_of!(**w); // LINT
    }
}
The user probably expects the line computing |
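For readers wondering what the two covered examples from Ralf could look like once no reference is involved, here is a hedged sketch. The names `prefix_raw`, `Bytes`, `len_raw` and `len_via_ref` are hypothetical, and `#[repr(transparent)]` is added here only so that the pointer cast used to exercise the code has a guaranteed layout; neither is taken from the PR. The sketch assumes a toolchain where `len` on raw slice pointers is stable (Rust 1.79 or later).

```rust
use std::ptr;

/// Raw sub-slice covering the first `len` bytes of `ptr`, built without
/// creating a reference.
///
/// Note the behavioural difference: unlike `(*ptr)[..len]`, this performs no
/// bounds check against the original slice length.
pub fn prefix_raw(ptr: *mut [u8], len: usize) -> *mut [u8] {
    ptr::slice_from_raw_parts_mut(ptr.cast::<u8>(), len)
}

/// Stand-in for the `Test` struct from the discussion.
#[repr(transparent)]
pub struct Bytes {
    data: [u8],
}

/// Length read from the pointer metadata only: `addr_of!` yields a
/// `*const [u8]`, and `len` is then called on the raw pointer itself.
pub fn len_raw(t: *const Bytes) -> usize {
    unsafe { ptr::addr_of!((*t).data).len() }
}

/// Same result, but the reference that `(*t).data.len()` would create
/// implicitly is written out, so the reader can see it.
/// This is only sound if `t` points to valid, suitably aliased memory.
pub fn len_via_ref(t: *const Bytes) -> usize {
    unsafe { (&(*t).data).len() }
}
```

The raw variants only manipulate pointer values and metadata, so they do not assert anything about the memory behind the pointer. The explicit-reference variant keeps the original semantics and makes the `&` visible at the call site.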
@JakobDegen well put, I have nothing to add. :) |
I'm going to un-nominate, since it looks like this has been discussed twice in the lang meeting without it getting un-nominated. Personally I think something along the lines of @JakobDegen's sketch sounds good. Please re-nominate if you'd like a quorum opinion on something here. |
Visiting for T-compiler triage. Based on discussion above, we do not think this is waiting-on-team any longer. Instead, the PR author should incorporate the feedback from @JakobDegen above. @rustbot label: -S-waiting-on-team +S-waiting-on-author |
@WaffleLapkin any updates on this? |
@RalfJung @JakobDegen just to be sure, you still agree with the suggestions from #103735 (comment) (for suggesting I think I finally have time to work on this. |
Yes that still sounds like a very good proposal to me. |
/// | ||
/// If you are sure, you can soundly take a reference, then you can take it explicitly: | ||
/// ```rust | ||
/// # use std::ptr::addr_of_mut; |
/// # use std::ptr::addr_of_mut; | |
/// use std::ptr::addr_of_mut; |
@WaffleLapkin any updates on this? |
@Dylan-DPC not much, I'm struggling to find time to work on this (and other PRs) :( |
closing in favor of #123239 |
… r=jdonszelmann,traviscross

Implement a lint for implicit autoref of raw pointer dereference - take 2

*[t-lang nomination comment](rust-lang#123239 (comment)

This PR aims at implementing a lint for implicit autoref of raw pointer dereference. It is based on rust-lang#103735 with suggestions and improvements from rust-lang#103735 (comment).

The goal is to catch cases like this, where the user probably doesn't realise they just created a reference.

```rust
pub struct Test {
    data: [u8],
}

pub fn test_len(t: *const Test) -> usize {
    unsafe { (*t).data.len() } // this calls <[T]>::len(&self)
}
```

Since rust-lang#103735 already went through T-lang twice, where T-lang ended up asking for a more restricted version (which is what this PR does), I would prefer this PR to be reviewed first before re-nominating it for T-lang.

----

Compared to the PR it is based on, this PR adds 3 restrictions on the outermost expression, which must be one of:
1. A deref followed by any non-deref place projection (that intermediate deref will typically be auto-inserted)
2. A method call annotated with `#[rustc_no_implicit_refs]`.
3. A deref followed by an `addr_of!` or `addr_of_mut!`. See bottom of post for details.

There are several points that are not 100% clear to me when implementing the modifications:
- ~~"4. Any number of automatically inserted deref/derefmut calls." I was never able to trigger this. Am I missing something?~~ Fixed
- Are "index" and "field" enough?

----

cc `@JakobDegen` `@WaffleLapkin`
r? `@RalfJung`
This PR implements an implicit_unsafe_autorefs lint that checks for implicit autorefs of raw pointer dereferences. An example:
I've made the lint deny-by-default, because this seems like an important footgun. However, given that even std had quite a few hits, maybe this should be warn-by-default.
Resolves #99437 (I think?)
r? compiler
cc @RalfJung