-
Notifications
You must be signed in to change notification settings - Fork 1.7k
CI: add clippy::needless_pass_by_value rule
#18468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @2010YOUY01 -- I think this looks like an improvement to me
If we are going to add this lint I think we should also update the various APIs to pass (not just add #[allow(clippy::needless_pass_by_value)]
If this PR makes sense, we can open a tracking issue and roll out this check to the remaining workspace packages. At least this can help prevent new inefficient patterns and identify existing issues that we can fix gradually.
I think it makes a lot of sense
| use std::sync::Arc; | ||
|
|
||
| #[allow(clippy::needless_pass_by_value)] | ||
| fn create_qualified_schema(qualifier: &str, names: Vec<&str>) -> Result<DFSchema> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could change this to impl IntoIterator<Item = &str> for example
| fn create_qualified_schema(qualifier: &str, names: Vec<&str>) -> Result<DFSchema> { | |
| fn create_qualified_schema(qualifier: &str, names: impl IntoIterator<Item = &str>) -> Result<DFSchema> { |
datafusion/common/src/hash_utils.rs
Outdated
| /// If `rehash==true` this combines the previous hash value in the buffer | ||
| /// with the new hash using `combine_hashes` | ||
| #[cfg(not(feature = "force_hash_collisions"))] | ||
| #[allow(clippy::needless_pass_by_value)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed? What is clippy's alternate suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I am wondering if this has found a good potential for optimization...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The arg array can safely be taken by reference.
However, I believe all current call sites already pass it by moving, so they’re fine.
That said, there might be optimization opportunities elsewhere — this needless_pass_by_value warning could indicate that some callers are cloning unnecessarily just to make the rust compiler happy.
ci/scripts/rust_clippy.sh
Outdated
| cargo clippy --all-targets --workspace --features avro,pyarrow,integration-tests,extended_tests -- -D warnings | ||
|
|
||
| # Update packages incrementally for stricter Clippy checks | ||
| # TODO: add tracking issue for the remaining workspace packages like `datafusion-catalog` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be better to add at the module level (rather than in the CI script) so that it would also be flagged locally when people ran clippy.
Similar to this:
datafusion/datafusion/common/src/lib.rs
Line 25 in e4f2b49
| #![deny(clippy::clone_on_ref_ptr)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reverted the CI script, and update the datafusion/common/cargo.toml instead in 2736b39, since we can update the whole datafusion-common crate at once.
When updating a larger crate, I think it's possible to configure like this to update module by module.
Thank you for the feedbacks! For the following cases, I kept the
For future works, I think we could start with crates that are performance critical or currently suffering from unnecessary clones. |
One place is the logical optimizer I think |
Which issue does this PR close?
An initial attempt towards #18467
Rationale for this change
Rationale for the additional lint rule
clippy::needless_pass_by_valueThere is a clippy lint rule that is not turned on by the current strictness level in CI: https://rust-lang.github.io/rust-clippy/master/index.html#needless_pass_by_value
Note it has the
Clippycategorypedantic, and its description islints which are rather strict or have occasional false positivesfrom https://doc.rust-lang.org/nightly/clippyIt seems we have been suffering from the excessive copying issue for quite some time, and @alamb is on the front line now #18413. I think this extra lint rule is able to help.
Implementation plan
This PR only enables this rule in
datafusion-commonpackage, and apply#[allow(clippy::needless_pass_by_value)]for all violations.If this PR makes sense, we can open a tracking issue and roll out this check to the remaining workspace packages. At least this can help prevent new inefficient patterns and identify existing issues that we can fix gradually.
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?