-
Notifications
You must be signed in to change notification settings - Fork 2.8k
fix(info): resolve underscore vs hyphen mismatch in schema lookup #16455
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
fix(info): resolve underscore vs hyphen mismatch in schema lookup #16455
Conversation
|
i will add one more commit for the functionality change, before that please check this code if it is good or not? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the test looks good and does what we want. Though rustfmt is failing so you'll need to fix that.
otherwise I think you should be good to start implementing the change :)
b577f22 to
d371702
Compare
This comment has been minimized.
This comment has been minimized.
d371702 to
e590ced
Compare
|
This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers. |
|
From discussion in #general > writing test for cargo @ 💬
Currently Cargo will prioritize crates that match the current workspace's MSRV (ref) when selecting the package. Not sure how others feel about this behavior but I wanted to call this out during review. Edit: Apparently crates.io does not allow publishing crates with the same name as an existing crate with hyphens instead of underscores and vise versa. So my concern here might not be a problem in practice. Though unsure if we need to worry about 3rd party registries. I don't see this behavior specified in the Cargo registry API docs. |
ee9afc3 to
3e8367f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| snapbox::cmd::Command::cargo_ui() | ||
| .arg("info") | ||
| .arg("my_package") //underscore on purpose to show the error | ||
| .arg("--registry=dummy-registry") | ||
| .assert() | ||
| .failure() | ||
| .stdout_eq(file!["stdout1.term.svg"]) | ||
| .stderr_eq(file!["stderr1.term.svg"]); | ||
|
|
||
| snapbox::cmd::Command::cargo_ui() | ||
| .arg("info") | ||
| .arg("my-crate") //hyphen on purpose to show the error | ||
| .arg("--registry=dummy-registry") | ||
| .assert() | ||
| .failure() | ||
| .stdout_eq(file!["stdout2.term.svg"]) | ||
| .stderr_eq(file!["stderr2.term.svg"]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These seem like independent test cases, is there a reason they are in the same test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes they come under the same test for checking the normalization of crate name which checks _ ->- and -to -> _
They are very much similar work so i kept under the same test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally would have these as separate tests
- their fixtures are different
- their UI is being tested separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i can do that also. Should i go ahead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please split this test
src/cargo/core/package_id_spec.rs
Outdated
| self.matches_opts(package_id, false) | ||
| } | ||
|
|
||
| fn matches_opts(&self, package_id: PackageId, normalize_name: bool) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not thrilled with boolean parameters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose of this is for checking whether any normalization happened or not. Should I do something like a variable isnormalized_name or leave it like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there are two problems with this function
- the name does not convey what it does (it has opts, ok but what?)
- it takes a boolean which can make it unclear what this function does at the call site (
matches_opts(something, true), what does that do?)
One philosophy of API design is if a function's parameter is static then it should really be a separate function. We don't do that with query_vec, largely because that was how it started and it has a fairly deep stack for parallel calls.
Another approach for improving this is to have a QueryKind parameter which will make the bool's behavior clear.
In this case, I would likely just go with a unique function name though not too sure what I would call it at this point. Having "normalized" in there would help create a connection with QueryKind. So matches_normalized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense, I agree the boolean hurts readability. I’ll split this into a matches_normalized helper so the intent is clear at the call site.
src/cargo/core/package_id_spec.rs
Outdated
| self.matches_opts(package_id, false) | ||
| } | ||
|
|
||
| fn matches_opts(&self, package_id: PackageId, normalize_name: bool) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this? I see we do a .matches but I'm confused as to why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we do have .match but this handle it very differently and don't check for the normalised crate name like cargo info foo_bar but in registry it is present foo-bar so it will reject it but it should not since this is a minor error. This should run with a warning of converting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm wondering about is why the .match call is present. We already have a query_summaries before we find among them. Doesn't that do the filtering for us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes query_summary is present to find out the normalized one, so there is this line package_id if this is not present while checking the crate in the registry, it finds package_id in find_pkgid_in_summaries here, we do filtering with the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think what is happening is query_summaries is returning all summaries for a source + name. In find_pkgid_in_summaries we are then filtering by version within the spec but also running into the problem that the name didn't match exactly.
Maybe what we should do is check what name was returned in the summaries (they should all be the same) and rewrite the spec to use the normalized name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we rewrite the spec that would be great, i thought of it initially, i was unsure that is it a good thing to change the spec to use the nomalized name then all these functions and would not be required to create. Just change the QueryKind::Normalized and update the spec; that's it, no changes are required elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which one should i go with making of matches_normalized one or changing the spec name parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I lean towards changing the spec as it is the least intrusive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the spec is of type &PackageIdSpec which is not mutable either we have to change the mutability at the call or we have to initailze a new PackageIdSpec for it clone the other things then pass to the find_pkgid_in_summaries
i am not sure to whether i should change the mutability or not? for now i going with the clone.
|
I just renamed the PR title, as we usually only include one type in the Conventional Message. |
src/cargo/ops/registry/info/mod.rs
Outdated
|
|
||
| let summaries = query_summaries(spec, &mut registry, &source_ids)?; | ||
| let (summaries, normalized_name) = query_summaries(spec, &mut registry, &source_ids)?; | ||
| let new_spec = if let Some(name) = normalized_name { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new_spec -> normalized_spec
src/cargo/ops/registry/info/mod.rs
Outdated
| let new_spec = if let Some(name) = normalized_name { | ||
| if name != spec.name() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these ifs be merged?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remember to clean up the commits to reflect what should be reviewed and merged and not the development process
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anything else is also required?
11b9042 to
dea0635
Compare
What does this PR try to resolve?
This PR makes
cargo infohandle crate names with underscores (_) the same way as cargo add, by translating underscores to hyphens (-) and emitting a warning and also the other way around.Currently,
cargo info my_packageandcargo info my-cratefails even when the actual crate name ismy-packageandmy_craterespectively, which is surprising and inconsistent with other Cargo commands. This change aligns cargo info withexisting behavior and improves usability.
Fixes #16442
How to test and review this PR?
Run the added tests:
cargo test --test testsuite -- cargo_info::crate_name_normalization_from_hyphen_to_underscorecargo test --test testsuite -- cargo_info::crate_name_normalization_from_underscore_to_hyphenManually verify behavior:
cargo info esp_printlncargo info big-sThis should succeed, show a warning about translating
_to-and also-to_, and display the crate info.The change includes a test that captures the previous failing behavior and verifies the new expected output.