Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the :rev(...) filter #965

Merged
merged 1 commit into from
Nov 15, 2022

Conversation

christian-schilling
Copy link
Member

@christian-schilling christian-schilling commented Oct 3, 2022

This allows selectively applying filtering rules to different parts of the history by providing a sha -> filter mapping as an argument.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 3, 2022

I am trying this on Miri, and something doesn't seem right...

git fetch http://localhost:8000/rust-lang/rust.git:start=75dd959a3a40eb5b4574f8d2e23aa6efbeb33573[:prefix=src/tools/miri]:/src/tools/miri.git master

I started this a few hours ago and completely forgot about it, and it is still running. Clearly it is not going anywhere.

The corresponding command with #961 is

git fetch http://localhost:8000/rust-lang/rust.git:subtree_prefix=75dd959a3a40eb5b4574f8d2e23aa6efbeb33573,src/tools/miri:/src/tools/miri.git master

and it completes in around 10min on an empty cache.

@christian-schilling
Copy link
Member Author

christian-schilling commented Oct 5, 2022

I started this a few hours ago and completely forgot about it, and it is still running. Clearly it is not going anywhere.

Found the bug, fixed in newest version of this PR.

The reason was in contrast to the test case, in the real world you used a chained filter (:prefix=a/b is internally converted to :prefix=b:prefix=a) and those chained filters are not cached directly. The implementation relied on finding the startfilter in the cache, which never happened for such a filter.

Now this should work (adapted test case, and tried it on miri as well).

The only question remaining for me is if start is really the final name for this filter ;)

@RalfJung
Copy link
Contributor

RalfJung commented Oct 5, 2022

Yeah I was wondering about that name. :if_ancestor_of=sha[filter] or something like that maybe? I feel like it should invoke a conditional, to make clear that it doesn't just extract that subhistory.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 5, 2022

The new branch looks great. :)

I tried pulling Miri from Rust, and the history contained commit 338c50e396f31c3970ab4ae7bc5d79807c6c033b which is what I extracted with my branch.

I tried pushing Miri to Rust, and got commit 6f2b52ff103f9095fb2817637bc7d388bd4c7f6b, the exact same as with by branch. This includes the same strange choice of base commit, 8e3b9bca65d7d79a3b0e9c33fed8d8c93dd66041. Pushing was a lot faster than with my branch, probably I didn't wire up some caching correctly.

@christian-schilling christian-schilling changed the title Implement the :start=sha[...] filter Implement the :at_commit=sha[...] filter Oct 6, 2022
@RalfJung
Copy link
Contributor

RalfJung commented Oct 7, 2022

at_commit seems fine, though I would think that this applies only to that commit, not to the entire subhistory starting at that commit.

@christian-schilling
Copy link
Member Author

christian-schilling commented Oct 8, 2022

though I would think that this applies only to that commit, not to the entire subhistory starting at that commit.

It does only apply to that one specific commit, not the subhistory before it. That was actually a bug in the other implementation. Consider a history like this:

* A
|  * B
|  * C
| /
* D
* E

And the filter :at_commit=C[:filter]
Now how will the filtered versions A'and B' of A and B look like?
The filtered history of A will be just A - D - E while B will be B - C' - D' - E'.
The previous implementation would, if B was filtered first insert filtered commits for D end E into the cache and then deliver wrong results for A.

So the presence of C further up the history does not in any case change how D and E are filtered. It only affects C. However because C is "substituted" with C' and the parent of that is D' it looks like the previous history was filtered. This is kind of "switching lanes" and not like actually changing the filter on the same lane.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 8, 2022

It does only apply to that one specific commit, not the subhistory before it.

But what about :at_commit=B[:filter]? Both B and C are being filtered now. So it's definitely not just at that commit.

It is true that I did not consider the situation where the subtree "merges back" into the main history. For the use-cases I can imagine for this filter, that will never happen. But it is also true that more than 1 commit is being filtered -- all the commits of the git subtree-imported Miri history in rustc are being filtered.

@christian-schilling
Copy link
Member Author

Yes, they are being filtered kind of as a "dependency" of the at_commit filter. Even on a perfectly linear history, the filtered versions of the commits before the "chosen" will not look filtered if you ask for them directly but you will get a history that includes the filtered versions if you ask for a commit further up.

The reason I have not merged this yet, is because I keep thinking about what to do in case you want multiple conditional filter operations in combination. Chaining will not really work because you don't know the sha in advance.
I'm considering a filter that accepts multiple (sha, filter) pairs, but for that the grammar will need to be extended even more. 🤔
Might still be worth it though.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 8, 2022

Yes, they are being filtered kind of as a "dependency" of the at_commit filter. Even on a perfectly linear history, the filtered versions of the commits before the "chosen" will not look filtered if you ask for them directly but you will get a history that includes the filtered versions if you ask for a commit further up.

Okay that is pretty confusing. From a user perspective it sure looks like the filter just applies to all these commits.

(Also I think in my branch, looking at the earlier commits would have looked filtered, if the "chosen" commit already exists so that the ancestor check applies.)

The reason I have not merged this yet, is because I keep thinking about what to do in case you want multiple conditional filter operations in combination. Chaining will not really work because you don't know the sha in advance.

Yeah I wondered about that too -- it might happen in rustc if other tools also switch to josh. My implementation propagated the same commit id to every filter in the chain so I think it would have worked with chaining, I have no idea what yours will do when at_commit is not the first filter in the chain. The intermediate trees are not even turned into commits, are they? So there's no commit id to compare against?

@christian-schilling
Copy link
Member Author

Propagating the original commit down the chain would require also adding that to the key used for caching. Making the cache a lot bigger potentially and also making hits less likely.
It would break some of the basic ideas that make josh as efficient as it es today.

So I think in the end this special filter will have to be that: special. We already have that in another case: The workspace filter can only be in the top level chain and not anywhere deeper.

This will be similar.

However I think I want to solve the "multiple chosen commits" situation before settling on an API for this.

I'm confident I though that shas will remain compatible so you can already use this as is if you don't mind running a non released version for some time.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 8, 2022

Yeah I can see how propagating the commit could break a data model.

I'm confident I though that shas will remain compatible so you can already use this as is if you don't mind running a non released version for some time.

Sounds good, thanks!

@RalfJung

This comment was marked as resolved.

@RalfJung
Copy link
Contributor

RalfJung commented Oct 9, 2022

Oh, I think I was on an outdated version of this branch. I guess the remote: parse_item: no match meant that I had an error in my filter expression. I wasn't sure whether that error came from git or josh.

@christian-schilling
Copy link
Member Author

So it works now?

@RalfJung
Copy link
Contributor

RalfJung commented Oct 9, 2022 via email

@christian-schilling christian-schilling force-pushed the @changes/master/[email protected]/start-filter branch 3 times, most recently from c8084be to 5415575 Compare October 11, 2022 12:21
@christian-schilling christian-schilling marked this pull request as draft October 12, 2022 11:12
@RalfJung
Copy link
Contributor

RalfJung commented Oct 21, 2022

Strangely, I am now getting an error when trying to push to this filter. The same thing worked fine a few times before, but now when I try to push the latest bit of Miri history it fails.

From a clone of https://github.com/rust-lang/miri, I am doing

git push http://localhost:8000/RalfJung/rust.git:at_commit=75dd959a3a40eb5b4574f8d2e23aa6efbeb33573[:prefix=src/tools/miri]:/src/tools/miri.git -o base=master HEAD:miri

In RalfJung/rust, miri does not exist and master is b1ab3b7, the same commit that the last josh pull was made from.

Pushing then shows

remote: josh-proxy
remote: response from upstream:
remote: rejecting merge with 2 parents:
remote: "Auto merge of #2583 - RalfJung:rustup, r=oli-obk"

That can only be referring to 85f449aafb9c9a522e03b9d9a05e0d7b47674586, which was already present in the last mergers and did not cause problems before. I don't understand what is different now.

@RalfJung
Copy link
Contributor

FWIW I am getting the same error with my own subtree_filter branch. I am sure I tested something like this before we decided to land the josh changes, but clearly what I tested was not representative for an actual push with real changes in both repos. (I have first synced the monorepo changes into the subrepo, and now I am failing to push the resulting sync back to the monorepo.)

@christian-schilling christian-schilling force-pushed the @changes/master/[email protected]/start-filter branch 6 times, most recently from 5f01c01 to e48826e Compare November 14, 2022 15:20
@christian-schilling christian-schilling changed the title Implement the :at_commit=sha[...] filter Implement the :rev(...) filter Nov 14, 2022
@christian-schilling christian-schilling marked this pull request as ready for review November 14, 2022 15:22
docs/src/reference/filters.md Outdated Show resolved Hide resolved
@christian-schilling christian-schilling force-pushed the @changes/master/[email protected]/start-filter branch 6 times, most recently from f232237 to 64532cf Compare November 15, 2022 15:05
Change: start-filter
@LMG LMG merged commit e1d10b6 into master Nov 15, 2022
@LMG LMG deleted the @changes/master/[email protected]/start-filter branch November 15, 2022 15:31
bors added a commit to rust-lang/miri that referenced this pull request Nov 15, 2022
update josh instructions

josh-project/josh#965 and josh-project/josh#994 have been merged so we don't need a forked josh any more. :)

However, this is blocked on josh-project/josh#1032 which currently prevents me from actually testing this...
bors added a commit to rust-lang/miri that referenced this pull request Nov 15, 2022
update josh instructions

josh-project/josh#965 and josh-project/josh#994 have been merged so we don't need a forked josh any more. :)

However, this is blocked on josh-project/josh#1032 which currently prevents me from actually testing this...
RalfJung pushed a commit to RalfJung/rust that referenced this pull request Nov 15, 2022
update josh instructions

josh-project/josh#965 and josh-project/josh#994 have been merged so we don't need a forked josh any more. :)

However, this is blocked on josh-project/josh#1032 which currently prevents me from actually testing this...
RalfJung pushed a commit to RalfJung/miri that referenced this pull request Nov 17, 2022
update josh instructions

josh-project/josh#965 and josh-project/josh#994 have been merged so we don't need a forked josh any more. :)

However, this is blocked on josh-project/josh#1032 which currently prevents me from actually testing this...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants