allow deps to checkout unmerged commits #2922

Open
wants to merge 1 commit into
base: main
12 changes: 11 additions & 1 deletion apps/rebar/src/rebar_git_resource.erl
@@ -245,7 +245,17 @@ git_clone(ref, _GitVsn, Url, Dir, Ref) ->
                          rebar_utils:escape_chars(Url),
                          rebar_utils:escape_chars(filename:basename(Dir))]),
                    [{cd, filename:dirname(Dir)}]),
-    rebar_utils:sh(?FMT("git checkout -q ~ts", [rebar_utils:escape_chars(Ref)]), [{cd, Dir}]);
+    case rebar_utils:sh(?FMT("git checkout -q ~ts", [rebar_utils:escape_chars(Ref)]),
+                        [{cd, Dir}, return_on_error]) of
ferd (Collaborator), Nov 16, 2024

I have to say I'm a bit confused by this.

Since this command directly follows a git clone, the refs should already be alright. Additionally, cloning the same repository twice would cause problems:

λ [vps] /tmp → git clone https://github.com/erlang/rebar3.git rebar3
Cloning into 'rebar3'...
remote: Enumerating objects: 23894, done.
remote: Counting objects: 100% (545/545), done.
remote: Compressing objects: 100% (278/278), done.
remote: Total 23894 (delta 231), reused 489 (delta 209), pack-reused 23349 (from 1)
Receiving objects: 100% (23894/23894), 7.90 MiB | 12.46 MiB/s, done.
Resolving deltas: 100% (16020/16020), done.
λ [vps] /tmp → git clone https://github.com/erlang/rebar3.git rebar3
fatal: destination path 'rebar3' already exists and is not an empty directory.

I'm struggling to find the specific scenario in which:

  • the clone works and therefore the repository is fresh (we tend to always do a fresh fetch and move the directory over)
  • but for which the checkout of the ref fails and a new git fetch happens to fix the checkout

How is this taking place exactly? I would expect only a few seconds, or a few minutes at most, to elapse between the two commands (depending on repo size and network speed). So this seems like an odd situation, where one would fetch a branch or tag that does not yet exist but is created during the repo download?

Since this is untested, it's hard to frame and figure out how to manually validate this situation.

Author

{deps, [
    {abcd,
        {git, "https://gerrit-hello.ca/a/kitty/kitty-abcd.git",
            {ref, "fdf3bd64c113375399e570402be3864b539a0955"}}},
    {bcde,
        {git, "https://gerrit-hello.ca/a/kitty/kitty-bcde.git",
            {ref, "225773cef58bdfab33457e421ae60c0aac0737a0"}}},
    {cdef,
        {git, "https://gerrit-hello.ca/a/kitty/kitty-cdef.git",
            {ref, "361d0f873a1d3554104d0104873dcfb6dc614fc0"}}},
    {escs,
        {git, "https://gerrit-hello.ca/a/kitty/kitty-escs.git",
            {ref, "39f13a1bd093974853314fb55c00f4cd71f0ab87"}}},
    ....

Scenario for the change:

For example, there is a parent project repository (sys) that depends on a set of submodules, each of which has its own repository. Each submodule has its own Common Test suites, and the parent project also has integrated Common Test suites.
Let’s say a submodule "abcd" has a new, not yet merged commit targeting its master branch (assume the commit SHA is "fdf3bd64...."). That commit triggers a CI integration test.
In the CI job (e.g., Jenkins), the parent project repository is cloned into a clean workspace. At this point, the parent project’s rebar.config pins each submodule to a SHA on its master branch. Then, the CI job modifies rebar.config, replacing the SHA of submodule "abcd" with "fdf3bd64...". After the dependencies of the parent project (sys) are fetched, the integration test runs and provides a verdict for the corresponding commit of submodule "abcd".
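The SHA replacement step described here could look roughly like the following. This is only a sketch, not the actual CI script: the sed range expression, the temporary directory, and the all-zero placeholder SHA are assumptions made for illustration, while the dependency names, URLs, and other SHAs come from the rebar.config excerpt above.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory (assumption: the real job edits the checked-out sys repo).
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the parent project's rebar.config. The all-zero SHA is a
# placeholder for whatever master SHA "abcd" was pinned to before the edit.
cat > rebar.config <<'EOF'
{deps, [
    {abcd,
        {git, "https://gerrit-hello.ca/a/kitty/kitty-abcd.git",
            {ref, "0000000000000000000000000000000000000000"}}},
    {bcde,
        {git, "https://gerrit-hello.ca/a/kitty/kitty-bcde.git",
            {ref, "225773cef58bdfab33457e421ae60c0aac0737a0"}}}
]}.
EOF

dep=abcd
new_sha=fdf3bd64c113375399e570402be3864b539a0955

# Rewrite only the {ref, "..."} that belongs to $dep: the address range starts
# at the line opening the dependency tuple and ends at its first {ref, line.
sed "/{$dep,/,/{ref,/ s/{ref, \"[0-9a-f]*\"}/{ref, \"$new_sha\"}/" \
    rebar.config > rebar.config.new
mv rebar.config.new rebar.config

cat rebar.config
```

The range address keeps the substitution from touching the refs of the other dependencies. It relies on each dependency tuple starting on its own line, as in the excerpt.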

Without the change, fetching the dependencies of the parent project (sys) with rebar3 would fail. This is because rebar3 currently performs a git clone of "abcd" followed by git checkout "fdf3bd64...". However, the specific commit ("fdf3bd64...") in submodule "abcd" does not exist in the history of the cloned master branch, so git checkout "fdf3bd64..." fails. When that happens, this change attempts to fetch the SHA "fdf3bd64..." from the remote repository and then performs another git checkout "fdf3bd64...".
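The failure and the fetch-then-retry sequence can be reproduced with plain git against a throwaway local repository. The following is a minimal sketch under stated assumptions: the unmerged commit is published only under a Gerrit-style ref (refs/changes/01/1/1 is made up for the demo), the server side permits fetching that commit by SHA, and the directory names upstream and dep are hypothetical.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Fixed identity so commits can be created on any machine.
export GIT_AUTHOR_NAME=ci GIT_AUTHOR_EMAIL=ci@example.invalid
export GIT_COMMITTER_NAME=ci GIT_COMMITTER_EMAIL=ci@example.invalid

# "Remote" repository with one merged commit on its default branch.
git init -q upstream
git -C upstream commit -q --allow-empty -m "merged commit"

# An unmerged change: a commit that no branch points at, reachable only
# through a review ref (hypothetical ref name).
unmerged=$(git -C upstream commit-tree -p HEAD -m "unmerged change" "HEAD^{tree}")
git -C upstream update-ref refs/changes/01/1/1 "$unmerged"

# Current behaviour: clone without checkout, then check out the ref.
# file:// forces a real fetch, so only branch history is transferred.
git clone -q -n "file://$work/upstream" dep
if git -C dep checkout -q "$unmerged" 2>/dev/null; then
    first=ok
else
    first=failed
fi

# Fallback added by the change: fetch the SHA explicitly, then retry.
git -C dep fetch -q origin "$unmerged"
git -C dep checkout -q "$unmerged"
second=$(git -C dep rev-parse HEAD)

echo "first checkout: $first"
echo "HEAD after fetch and retry: $second"
```

Whether the explicit fetch succeeds against a real server depends on that server's configuration, so this demonstrates the mechanism and does not guarantee the behavior for every host.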

This change does not modify the original behavior of rebar3 (e.g., if the ref can already be found in the history of the local cloned repository).

This change has been tested in CI for the (sys) project.

Collaborator

Then, the CI job modifies the rebar.config, replacing the SHA of submodule "abcd" with "fdf3bd64...".

After this step, do you do a rebar3 upgrade abcd to make sure Rebar3 fetches the update and re-propagates the transitive dependency versions to the lock file, or is this skipped wholesale?

Author

Correct: the CI job modifies both rebar.config and rebar.lock in the parent project (sys) right after a clean clone of (sys) in Jenkins, then directly starts "rebar3 compile".

Manually running "rebar3 upgrade abcd" to update rebar.lock works fine.

[repo/gucliti/sys]$ ./rebar3 upgrade abcd
Building ...
===> Verifying dependencies...
Building ...
Building ...
===> Upgrading abcd (from {git,"https://gerrit-hello.ca/a/kitty/kitty-abcd.git",
{ref,"d6dd6e5d2c35db54e9b5a59b38a252a4123cc783"}})
===> No upgrade needed for ...
....
===> No upgrade needed for ...
===> No upgrade needed for ...
===> No upgrade needed for ...
[repo/gucliti/sys]$

Note: the above is a real log with obfuscated names. "d6dd6e5d..." is the SHA of an unmerged commit.

Collaborator

So should the CI steps just include a rebar3 upgrade dep1,dep2,dep3 step to avoid having to add specific workarounds for people modifying autogenerated lock files? Rebar3 maintains its own files and deps to avoid these inconsistencies, I don't know that it's the best strategy to pepper workarounds for manual changes here and there?

I'm still surprised the git clone step doesn't fail when the repo is already checked out.

Author

So should the CI steps just include a rebar3 upgrade dep1,dep2,dep3 step to avoid having to add specific workarounds for people modifying autogenerated lock files? Rebar3 maintains its own files and deps to avoid these inconsistencies, I don't know that it's the best strategy to pepper workarounds for manual changes here and there?

For the CI script, editing rebar.config and editing rebar.lock after the git clone are almost the same, since the script knows the original SHA of "depx". Yes, CI can just update rebar.config and then run rebar3 upgrade depx.

Without the modified version of rebar3, if only the SHA (an unmerged commit) of the dependency "depx" is updated in rebar.config, rebar3 upgrade depx fails.
DIAGNOSTIC=1 rebar3 upgrade depx shows:
... fatal: reference is not a tree: d6dd6e5d...

With the modified version of rebar3, however, rebar3 upgrade depx successfully updates rebar.lock.

The key point of this change is to support CI scenarios where a dependency commit that has not yet been merged into master needs to be used in integration testing, so that the test can give a verdict on "depx".

I'm still surprised the git clone step doesn't fail when the repo is already checked out.

The CI flow almost always picks one "delta" (a change in dep_x):
git clone sys into a new ephemeral workspace, compile, test, and finally clear the workspace.

+        {ok, _} ->
+            ok;
+        {error, Reason} ->
+            ?DEBUG("Initial git checkout failed for ref ~ts: ~p", [Ref, Reason]),
+            rebar_utils:sh(?FMT("git fetch origin ~ts", [rebar_utils:escape_chars(Ref)]),
+                           [{cd, Dir}]),
+            rebar_utils:sh(?FMT("git checkout -q ~ts", [rebar_utils:escape_chars(Ref)]),
+                           [{cd, Dir}])
+    end;
 git_clone(rev, _GitVsn, Url, Dir, Rev) ->
     rebar_utils:sh(?FMT("git clone ~ts -n ~ts ~ts",
                         [git_clone_options(),