Skip to content

CodeQL-inspired fixes #1891

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from
2 changes: 1 addition & 1 deletion builtin/commit.c
Original file line number Diff line number Diff line change
Expand Up @@ -1022,7 +1022,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
for (i = 0; i < the_repository->index->cache_nr; i++)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:39PM +0000, Johannes Schindelin via GitGitGadget wrote:

> diff --git a/builtin/commit.c b/builtin/commit.c
> index 66bd91fd523d..fba0dded64a7 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1022,7 +1022,7 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
>  			for (i = 0; i < the_repository->index->cache_nr; i++)
>  				if (ce_intent_to_add(the_repository->index->cache[i]))
>  					ita_nr++;
> -			committable = the_repository->index->cache_nr - ita_nr > 0;
> +			committable = the_repository->index->cache_nr > ita_nr;

I guess it is not possible for ita_nr to be greater than cache_nr, since
we are counting up entries in the loop above. If ita_nr were greater,
the original would wrap around and set committable to true, but yours
would not.

So really, I think the original was equivalent to:

  committable = cache_nr != ita_nr;

but I think ">" probably expresses the intent better (we want to know if
there are any non-ita entries). Though in that case I'd think:

  committable = 0;
  for (i = 0; i < cache_nr; i++) {
	if (!ce_intent_to_add(...) {
		committable = 1;
		break;
	}
  }

would be the most clear, since we do not otherwise care about the actual
number of ita entries. And lets us break out of the loop early.

I dunno if it is worth refactoring further, though. Your patch does the
correct thing and fixes the codeql complaint (which I do think is a
false positive, because ita_nr must be less than cache_nr).

-Peff

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <[email protected]> writes:

> ... there are any non-ita entries). Though in that case I'd think:
>
>   committable = 0;
>   for (i = 0; i < cache_nr; i++) {
> 	if (!ce_intent_to_add(...) {
> 		committable = 1;
> 		break;
> 	}
>   }
>
> would be the most clear, since we do not otherwise care about the actual
> number of ita entries. And lets us break out of the loop early.

Exactly.  If you focus on the warning too narrowly, the minimal
change in the original patch does look OK, but in the original (even
before Dscho's patch, that is) the intent is unclear, as opposed to
what you showed above.  And the update to squelch false positive
does not improve the clarity of the logic as the above rewrite does.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:37:00PM -0700, Junio C Hamano wrote:

> Jeff King <[email protected]> writes:
> 
> > ... there are any non-ita entries). Though in that case I'd think:
> >
> >   committable = 0;
> >   for (i = 0; i < cache_nr; i++) {
> > 	if (!ce_intent_to_add(...) {
> > 		committable = 1;
> > 		break;
> > 	}
> >   }
> >
> > would be the most clear, since we do not otherwise care about the actual
> > number of ita entries. And lets us break out of the loop early.
> 
> Exactly.  If you focus on the warning too narrowly, the minimal
> change in the original patch does look OK, but in the original (even
> before Dscho's patch, that is) the intent is unclear, as opposed to
> what you showed above.  And the update to squelch false positive
> does not improve the clarity of the logic as the above rewrite does.

OK. If we do want to refactor, I think pulling it into a separate
function is the most descriptive, like:

diff --git a/builtin/commit.c b/builtin/commit.c
index 66bd91fd52..a8d43d223d 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -740,6 +740,15 @@ static void change_data_free(void *util, const char *str UNUSED)
 	free(d);
 }
 
+static int has_non_ita_entries(struct index_state *index)
+{
+	int i;
+	for (i = 0; i < index->cache_nr; i++)
+		if (!ce_intent_to_add(index->cache[i]))
+			return 1;
+	return 0;
+}
+
 static int prepare_to_commit(const char *index_file, const char *prefix,
 			     struct commit *current_head,
 			     struct wt_status *s,
@@ -1015,14 +1024,10 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
 			parent = "HEAD^1";
 
 		if (repo_get_oid(the_repository, parent, &oid)) {
-			int i, ita_nr = 0;
-
 			/* TODO: audit for interaction with sparse-index. */
 			ensure_full_index(the_repository->index);
-			for (i = 0; i < the_repository->index->cache_nr; i++)
-				if (ce_intent_to_add(the_repository->index->cache[i]))
-					ita_nr++;
-			committable = the_repository->index->cache_nr - ita_nr > 0;
+			committable =
+				has_non_ita_entries(the_repository->index);
 		} else {
 			/*
 			 * Unless the user did explicitly request a submodule

-Peff

if (ce_intent_to_add(the_repository->index->cache[i]))
ita_nr++;
committable = the_repository->index->cache_nr - ita_nr > 0;
committable = the_repository->index->cache_nr > ita_nr;
} else {
/*
* Unless the user did explicitly request a submodule
Expand Down
3 changes: 2 additions & 1 deletion builtin/fetch.c
Original file line number Diff line number Diff line change
Expand Up @@ -1728,7 +1728,7 @@ static int do_fetch(struct transport *transport,
if (transport->remote->follow_remote_head != FOLLOW_REMOTE_NEVER)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:44PM +0000, Johannes Schindelin via GitGitGadget wrote:

> As pointed out by CodeQL, `branch_get()` may return `NULL`, in which
> case `branch_has_merge_config()` would return early, but we can even
> avoid enumerating the refs prefixes in that case, saving even more CPU
> cycles.

I am not sure how this patch changes anything with respect to CPU. If
branch is NULL, then branch_has_merge_config(branch) will always return
false.

I think this is just an issue that CodeQL is not looking inside
branch_has_merge_config(), and thus does not realize we will never hit
the rest of the short-circuit conditional (let alone the body) in that
case?

Still may be worth dealing with, but it makes me a little sad to have to
add an extra redundant check (one place is not a big deal, but as a
general pattern).

-Peff

do_set_head = 1;
}
if (branch_has_merge_config(branch) &&
if (branch && branch_has_merge_config(branch) &&
!strcmp(branch->remote_name, transport->remote->name)) {
int i;
for (i = 0; i < branch->merge_nr; i++) {
Expand Down Expand Up @@ -2560,6 +2560,7 @@ int cmd_fetch(int argc,
if (server_options.nr)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:40PM +0000, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <[email protected]>
> 
> As pointed out by CodeQL, it is a potentially dangerous practice to
> store local variables' addresses in non-local structs. Yet this is
> exactly what happens with the `acked_commits` attribute that is used in
> `cmd_fetch()`: The pointer to a local variable is assigned to it.
> 
> Now, it is Git's convention that `cmd_*()` functions are essentially
> only returning just before exiting the process, therefore there is
> little danger that this attribute is used after the code flow returns
> from that function.

I was going to say: the real sin here is using a global variable in the
first place, without which gtransport would not survive outside of
cmd_fetch(). But the issue is even worse than that. The acked_commits
variable is inside a conditional block, so the address is stale for the
rest of cmd_fetch(), too!

It doesn't look like we ever examine it after that, but it's hard to
trace, since it's a global. ;)

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index cda6eaf1fd6e..c1a1434c7096 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -2560,6 +2560,7 @@ int cmd_fetch(int argc,
>  		if (server_options.nr)
>  			gtransport->server_options = &server_options;
>  		result = transport_fetch_refs(gtransport, NULL);
> +		gtransport->smart_options->acked_commits = NULL;
>  
>  		oidset_iter_init(&acked_commits, &iter);
>  		while ((oid = oidset_iter_next(&iter)))

Here you unset it within that conditional block, which is the right
spot. Looks good.

-Peff

gtransport->server_options = &server_options;
result = transport_fetch_refs(gtransport, NULL);
gtransport->smart_options->acked_commits = NULL;

oidset_iter_init(&acked_commits, &iter);
while ((oid = oidset_iter_next(&iter)))
Expand Down
141 changes: 69 additions & 72 deletions commit-graph.c
Original file line number Diff line number Diff line change
Expand Up @@ -2509,7 +2509,17 @@ int write_commit_graph(struct object_directory *odb,
const struct commit_graph_opts *opts)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:41PM +0000, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <[email protected]>
> 
> We do need a context to write the commit graph, but that context is only
> needed during the life time of `commit_graph_write()`, therefore it can
> easily be a stack variable.

Yay. I am in favor of using stack variables when possible as a general
rule.

> diff --git a/commit-graph.c b/commit-graph.c
> index 6394752b0b08..9f0115dac9b5 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2509,7 +2509,17 @@ int write_commit_graph(struct object_directory *odb,
>  		       const struct commit_graph_opts *opts)
>  {
>  	struct repository *r = the_repository;
> -	struct write_commit_graph_context *ctx;
> +	struct write_commit_graph_context ctx = {
> +		.r = r,
> +		.odb = odb,
> +		.append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0,
> +		.report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0,
> +		.split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0,
> +		.opts = opts,
> +		.total_bloom_filter_data_size = 0,
> +		.write_generation_data = (get_configured_generation_version(r) == 2),
> +		.num_generation_data_overflows = 0,
> +	};
>  	uint32_t i;
>  	int res = 0;
>  	int replace = 0;
> @@ -2531,17 +2541,6 @@ int write_commit_graph(struct object_directory *odb,
>  		return 0;
>  	}
>  
> -	CALLOC_ARRAY(ctx, 1);
> -	ctx->r = r;
> -	ctx->odb = odb;
> -	ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
> -	ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
> -	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
> -	ctx->opts = opts;
> -	ctx->total_bloom_filter_data_size = 0;
> -	ctx->write_generation_data = (get_configured_generation_version(r) == 2);
> -	ctx->num_generation_data_overflows = 0;

OK, this moves the initialization to the top of the function. So to
review this for correctness, we must make sure that we do not change the
values of any of those variables between the two spots (i.e., in the
diff context that is omitted).

Most of it looks fine. Our call to get_configured_generation_version()
now happens earlier, before the call to prepare_repo_settings(). I think
that is OK, because the former calls repo_config_get_int() directly. It
does seem like a potential maintenance problem if that call is ever
rolled into prepare_repo_settings().

So maybe OK, but the smaller change would be to just replace the calloc
with a memset(), and s/->/./ on the subsequent lines.

-Peff

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <[email protected]> writes:

> So maybe OK, but the smaller change would be to just replace the calloc
> with a memset(), and s/->/./ on the subsequent lines.

True, and it would be a bit easier to merge with other topics in
flight.  The .oid member and parameter are both renamed IIUC.

{
struct repository *r = the_repository;
struct write_commit_graph_context *ctx;
struct write_commit_graph_context ctx = {
.r = r,
.odb = odb,
.append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0,
.report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0,
.split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0,
.opts = opts,
.total_bloom_filter_data_size = 0,
.write_generation_data = (get_configured_generation_version(r) == 2),
.num_generation_data_overflows = 0,
};
uint32_t i;
int res = 0;
int replace = 0;
Expand All @@ -2531,32 +2541,21 @@ int write_commit_graph(struct object_directory *odb,
return 0;
}

CALLOC_ARRAY(ctx, 1);
ctx->r = r;
ctx->odb = odb;
ctx->append = flags & COMMIT_GRAPH_WRITE_APPEND ? 1 : 0;
ctx->report_progress = flags & COMMIT_GRAPH_WRITE_PROGRESS ? 1 : 0;
ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
ctx->opts = opts;
ctx->total_bloom_filter_data_size = 0;
ctx->write_generation_data = (get_configured_generation_version(r) == 2);
ctx->num_generation_data_overflows = 0;

bloom_settings.hash_version = r->settings.commit_graph_changed_paths_version;
bloom_settings.bits_per_entry = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_BITS_PER_ENTRY",
bloom_settings.bits_per_entry);
bloom_settings.num_hashes = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_NUM_HASHES",
bloom_settings.num_hashes);
bloom_settings.max_changed_paths = git_env_ulong("GIT_TEST_BLOOM_SETTINGS_MAX_CHANGED_PATHS",
bloom_settings.max_changed_paths);
ctx->bloom_settings = &bloom_settings;
ctx.bloom_settings = &bloom_settings;

init_topo_level_slab(&topo_levels);
ctx->topo_levels = &topo_levels;
ctx.topo_levels = &topo_levels;

prepare_commit_graph(ctx->r);
if (ctx->r->objects->commit_graph) {
struct commit_graph *g = ctx->r->objects->commit_graph;
prepare_commit_graph(ctx.r);
if (ctx.r->objects->commit_graph) {
struct commit_graph *g = ctx.r->objects->commit_graph;

while (g) {
g->topo_levels = &topo_levels;
Expand All @@ -2565,15 +2564,15 @@ int write_commit_graph(struct object_directory *odb,
}

if (flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS)
ctx->changed_paths = 1;
ctx.changed_paths = 1;
if (!(flags & COMMIT_GRAPH_NO_WRITE_BLOOM_FILTERS)) {
struct commit_graph *g;

g = ctx->r->objects->commit_graph;
g = ctx.r->objects->commit_graph;

/* We have changed-paths already. Keep them in the next graph */
if (g && g->bloom_filter_settings) {
ctx->changed_paths = 1;
ctx.changed_paths = 1;

/* don't propagate the hash_version unless unspecified */
if (bloom_settings.hash_version == -1)
Expand All @@ -2586,116 +2585,114 @@ int write_commit_graph(struct object_directory *odb,

bloom_settings.hash_version = bloom_settings.hash_version == 2 ? 2 : 1;

if (ctx->split) {
struct commit_graph *g = ctx->r->objects->commit_graph;
if (ctx.split) {
struct commit_graph *g = ctx.r->objects->commit_graph;

while (g) {
ctx->num_commit_graphs_before++;
ctx.num_commit_graphs_before++;
g = g->base_graph;
}

if (ctx->num_commit_graphs_before) {
ALLOC_ARRAY(ctx->commit_graph_filenames_before, ctx->num_commit_graphs_before);
i = ctx->num_commit_graphs_before;
g = ctx->r->objects->commit_graph;
if (ctx.num_commit_graphs_before) {
ALLOC_ARRAY(ctx.commit_graph_filenames_before, ctx.num_commit_graphs_before);
i = ctx.num_commit_graphs_before;
g = ctx.r->objects->commit_graph;

while (g) {
ctx->commit_graph_filenames_before[--i] = xstrdup(g->filename);
ctx.commit_graph_filenames_before[--i] = xstrdup(g->filename);
g = g->base_graph;
}
}

if (ctx->opts)
replace = ctx->opts->split_flags & COMMIT_GRAPH_SPLIT_REPLACE;
if (ctx.opts)
replace = ctx.opts->split_flags & COMMIT_GRAPH_SPLIT_REPLACE;
}

ctx->approx_nr_objects = repo_approximate_object_count(the_repository);
ctx.approx_nr_objects = repo_approximate_object_count(the_repository);

if (ctx->append && ctx->r->objects->commit_graph) {
struct commit_graph *g = ctx->r->objects->commit_graph;
if (ctx.append && ctx.r->objects->commit_graph) {
struct commit_graph *g = ctx.r->objects->commit_graph;
for (i = 0; i < g->num_commits; i++) {
struct object_id oid;
oidread(&oid, g->chunk_oid_lookup + st_mult(g->hash_len, i),
the_repository->hash_algo);
oid_array_append(&ctx->oids, &oid);
oid_array_append(&ctx.oids, &oid);
}
}

if (pack_indexes) {
ctx->order_by_pack = 1;
if ((res = fill_oids_from_packs(ctx, pack_indexes)))
ctx.order_by_pack = 1;
if ((res = fill_oids_from_packs(&ctx, pack_indexes)))
goto cleanup;
}

if (commits) {
if ((res = fill_oids_from_commits(ctx, commits)))
if ((res = fill_oids_from_commits(&ctx, commits)))
goto cleanup;
}

if (!pack_indexes && !commits) {
ctx->order_by_pack = 1;
fill_oids_from_all_packs(ctx);
ctx.order_by_pack = 1;
fill_oids_from_all_packs(&ctx);
}

close_reachable(ctx);
close_reachable(&ctx);

copy_oids_to_commits(ctx);
copy_oids_to_commits(&ctx);

if (ctx->commits.nr >= GRAPH_EDGE_LAST_MASK) {
if (ctx.commits.nr >= GRAPH_EDGE_LAST_MASK) {
error(_("too many commits to write graph"));
res = -1;
goto cleanup;
}

if (!ctx->commits.nr && !replace)
if (!ctx.commits.nr && !replace)
goto cleanup;

if (ctx->split) {
split_graph_merge_strategy(ctx);
if (ctx.split) {
split_graph_merge_strategy(&ctx);

if (!replace)
merge_commit_graphs(ctx);
merge_commit_graphs(&ctx);
} else
ctx->num_commit_graphs_after = 1;
ctx.num_commit_graphs_after = 1;

ctx->trust_generation_numbers = validate_mixed_generation_chain(ctx->r->objects->commit_graph);
ctx.trust_generation_numbers = validate_mixed_generation_chain(ctx.r->objects->commit_graph);

compute_topological_levels(ctx);
if (ctx->write_generation_data)
compute_generation_numbers(ctx);
compute_topological_levels(&ctx);
if (ctx.write_generation_data)
compute_generation_numbers(&ctx);

if (ctx->changed_paths)
compute_bloom_filters(ctx);
if (ctx.changed_paths)
compute_bloom_filters(&ctx);

res = write_commit_graph_file(ctx);
res = write_commit_graph_file(&ctx);

if (ctx->changed_paths)
if (ctx.changed_paths)
deinit_bloom_filters();

if (ctx->split)
mark_commit_graphs(ctx);
if (ctx.split)
mark_commit_graphs(&ctx);

expire_commit_graphs(ctx);
expire_commit_graphs(&ctx);

cleanup:
free(ctx->graph_name);
free(ctx->base_graph_name);
free(ctx->commits.list);
oid_array_clear(&ctx->oids);
free(ctx.graph_name);
free(ctx.base_graph_name);
free(ctx.commits.list);
oid_array_clear(&ctx.oids);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:47PM +0000, Johannes Schindelin via GitGitGadget wrote:

> The code is a bit too hard to reason about to fully assess whether the
> `fill_commit_graph_info()` function is called at all after
> `write_commit_graph()` returns (and hence the stack variable
> `topo_levels` goes out of context).
> 
> Let's simply make sure that the stack address is no longer used at that
> stage, thereby making the code quite a bit easier to reason about.

Yep, I think this is a good practice in general. If the topo_levels
member is never used outside of writing, I wonder if it could live in a
separate data structure. But that is a much bigger refactor that I don't
think we need to tackle here.

> diff --git a/commit-graph.c b/commit-graph.c
> index 9f0115dac9b5..d052c1bf15c5 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -2683,6 +2683,15 @@ cleanup:
>  	oid_array_clear(&ctx.oids);
>  	clear_topo_level_slab(&topo_levels);
>  
> +	if (ctx.r->objects->commit_graph) {
> +		struct commit_graph *g = ctx.r->objects->commit_graph;
> +
> +		while (g) {
> +			g->topo_levels = NULL;
> +			g = g->base_graph;
> +		}
> +	}

This just clears the pointers to the local variable. Looks good.

-Peff

clear_topo_level_slab(&topo_levels);

for (i = 0; i < ctx->num_commit_graphs_before; i++)
free(ctx->commit_graph_filenames_before[i]);
free(ctx->commit_graph_filenames_before);
for (i = 0; i < ctx.num_commit_graphs_before; i++)
free(ctx.commit_graph_filenames_before[i]);
free(ctx.commit_graph_filenames_before);

for (i = 0; i < ctx->num_commit_graphs_after; i++) {
free(ctx->commit_graph_filenames_after[i]);
free(ctx->commit_graph_hash_after[i]);
for (i = 0; i < ctx.num_commit_graphs_after; i++) {
free(ctx.commit_graph_filenames_after[i]);
free(ctx.commit_graph_hash_after[i]);
}
free(ctx->commit_graph_filenames_after);
free(ctx->commit_graph_hash_after);

free(ctx);
free(ctx.commit_graph_filenames_after);
free(ctx.commit_graph_hash_after);

return res;
}
Expand Down
2 changes: 1 addition & 1 deletion help.c
Original file line number Diff line number Diff line change
Expand Up @@ -214,7 +214,7 @@ void exclude_cmds(struct cmdnames *cmds, struct cmdnames *excludes)
else if (cmp == 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:45PM +0000, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <[email protected]>
> 
> While `if (i <= 0) ... else if (i > 0) ...` is technically equivalent to
> `if (i <= 0) ... else ...`, the latter is vastly easier to read because
> it avoids writing out a condition that is unnecessary. Let's drop such
> unnecessary conditions.
> 
> Pointed out by CodeQL.

Yeah, I'd agree that it is easier (otherwise, you are left wondering if
there is an "else" case you are missing).

>  help.c             | 2 +-
>  transport-helper.c | 2 +-

Both spots look good to me.

-Peff

ei++;
free(cmds->names[ci++]);
} else if (cmp > 0)
} else
ei++;
}

Expand Down
55 changes: 13 additions & 42 deletions read-cache.c
Original file line number Diff line number Diff line change
Expand Up @@ -1117,48 +1117,19 @@ static int has_dir_name(struct index_state *istate,
*
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:43PM +0000, Johannes Schindelin via GitGitGadget wrote:

> One thing that might be non-obvious to readers (or to analyzers like
> CodeQL) is that the function essentially does nothing when the Git index
> is empty, and in particular that it does not look at the value of
> `len_eq_last` (which would be uninitialized at that point).
> 
> Let's make this much easier to understand, by returning early if the Git
> index is empty, and by avoiding empty `else` blocks.

OK, so we return early, skipping not only what's in the conditional
that you're touching here, but also the "for(;;)" loop below.

And in that one, we'll look for the next slash (and break if none).
We'll check the name up to that stage via index_name_stage_pos(). And
obviously that will not find a match if there are no index entries. So
we'd do nothing and loop again, looking for the next slash, until we
eventually hit the end.

So yeah, I agree if there are no index entries we can bail immediately.

> This commit changes indentation and is hence best viewed using
> `--ignore-space-change`.

Yeah. I was puzzled at first by the amount of dropped code, but they are
all comments that say "fall through to the code below".

So I think the change here is correct. We are losing some comments that
could be helpful, but I'm not familiar enough with those code to say
whether they would be. Just reading what you've left makes sense to me
on its own.

-Peff

* Compare the entry's full path with the last path in the index.
*/
if (istate->cache_nr > 0) {
cmp_last = strcmp_offset(name,
istate->cache[istate->cache_nr - 1]->name,
&len_eq_last);
if (cmp_last > 0) {
if (name[len_eq_last] != '/') {
/*
* The entry sorts AFTER the last one in the
* index.
*
* If there were a conflict with "file", then our
* name would start with "file/" and the last index
* entry would start with "file" but not "file/".
*
* The next character after common prefix is
* not '/', so there can be no conflict.
*/
return retval;
} else {
/*
* The entry sorts AFTER the last one in the
* index, and the next character after common
* prefix is '/'.
*
* Either the last index entry is a file in
* conflict with this entry, or it has a name
* which sorts between this entry and the
* potential conflicting file.
*
* In both cases, we fall through to the loop
* below and let the regular search code handle it.
*/
}
} else if (cmp_last == 0) {
/*
* The entry exactly matches the last one in the
* index, but because of multiple stage and CE_REMOVE
* items, we fall through and let the regular search
* code handle it.
*/
}
}
if (!istate->cache_nr)
return 0;

cmp_last = strcmp_offset(name,
istate->cache[istate->cache_nr - 1]->name,
&len_eq_last);
if (cmp_last > 0 && name[len_eq_last] != '/')
/*
* The entry sorts AFTER the last one in the
* index and their paths have no common prefix,
* so there cannot be a F/D conflict.
*/
return 0;

for (;;) {
size_t len;
Expand Down
24 changes: 5 additions & 19 deletions trace2/tr2_tmr.c
Original file line number Diff line number Diff line change
Expand Up @@ -102,25 +102,11 @@ void tr2_update_final_timers(void)
struct tr2_timer *t_final = &final_timer_block.timer[tid];
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, May 15, 2025 at 01:11:46PM +0000, Johannes Schindelin via GitGitGadget wrote:

> CodeQL reports empty `if` blocks that only contain a comment as "futile
> conditional". The comment talks about potential plans to turn this into
> a warning, but that seems not to have been necessary. Replace the entire
> construct with a concise comment.

OK...

> -		if (t->recursion_count) {
> -			/*
> -			 * The current thread is exiting with
> -			 * timer[tid] still running.
> -			 *
> -			 * Technically, this is a bug, but I'm going
> -			 * to ignore it.
> -			 *
> -			 * I don't think it is worth calling die()
> -			 * for.  I don't think it is worth killing the
> -			 * process for this bookkeeping error.  We
> -			 * might want to call warning(), but I'm going
> -			 * to wait on that.
> -			 *
> -			 * The downside here is that total_ns won't
> -			 * include the current open interval (now -
> -			 * start_ns).  I can live with that.
> -			 */
> -		}
> +		/*
> +		 * `t->recursion_count` could technically be non-zero, which
> +		 * would constitute a bug. Reporting the bug would potentially
> +		 * cause an infinite recursion, though, so let's ignore it.
> +		 */

The original doesn't talk about infinite recursion at all, though I can
well believe that would be the case, having run into trace->die->trace
types of bugs before. Did you trace out the actual path of recursion? If
so, it might be worth summarizing it.

Obviously the code change itself cannot hurt anything, as it was a noop.

-Peff

struct tr2_timer *t = &ctx->timer_block.timer[tid];

if (t->recursion_count) {
/*
* The current thread is exiting with
* timer[tid] still running.
*
* Technically, this is a bug, but I'm going
* to ignore it.
*
* I don't think it is worth calling die()
* for. I don't think it is worth killing the
* process for this bookkeeping error. We
* might want to call warning(), but I'm going
* to wait on that.
*
* The downside here is that total_ns won't
* include the current open interval (now -
* start_ns). I can live with that.
*/
}
/*
* `t->recursion_count` could technically be non-zero, which
* would constitute a bug. Reporting the bug would potentially
* cause an infinite recursion, though, so let's ignore it.
*/

if (!t->interval_count)
continue; /* this timer was not used by this thread */
Expand Down
2 changes: 1 addition & 1 deletion transport-helper.c
Original file line number Diff line number Diff line change
Expand Up @@ -1437,7 +1437,7 @@ static int udt_do_read(struct unidirectional_transfer *t)
transfer_debug("%s EOF (with %i bytes in buffer)",
t->src_name, (int)t->bufuse);
t->state = SSTATE_FLUSHING;
} else if (bytes > 0) {
} else {
t->bufuse += bytes;
transfer_debug("Read %i bytes from %s (buffer now at %i)",
(int)bytes, t->src_name, (int)t->bufuse);
Expand Down
Loading