@derrickstolee derrickstolee commented Nov 26, 2025

In September [1], we discussed that the Scalar config options could use some documented justification, as well as comments in the config file noting that they were set by Scalar. I was then immediately distracted by other work things and am finally here with a series to do just that.

[1] https://lore.kernel.org/git/[email protected]/

I have indeed used Patrick's idea to add '# set by scalar' to each line added by Scalar; it took a little more work to cover all the kinds of config being set. I made myself a co-author.

While working to justify each config option, I found some stale or incorrect config options. I also relaxed the override setting in most cases which gave me an opportunity to alphabetize the settings.

There was at least one case (I'm thinking of core.fscache here) where the config doesn't even exist in core Git, but instead in Git for Windows. We'll need to adjust in that fork to reinclude it in the right place.

Updates in V2

  • The config-setting code is simplified somewhat.
  • Use 'sane_unset' instead of 'export' in test.
  • Documentation is improved for typos, grammar, and clarity.

Thanks,
-Stolee

cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: Matthew Hughes [email protected]

@derrickstolee derrickstolee self-assigned this Nov 26, 2025
@derrickstolee derrickstolee force-pushed the scalar-config branch 2 times, most recently from f38ead9 to 18580f0 on November 26, 2025 19:58
@derrickstolee

/submit


gitgitgadget bot commented Nov 26, 2025

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2010/derrickstolee/scalar-config-v1

To fetch this version to local tag pr-2010/derrickstolee/scalar-config-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2010/derrickstolee/scalar-config-v1

#include "refs.h"
#include "dir.h"
#include "packfile.h"
#include "help.h"

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> Add "# set by scalar" to the end of each config option to assist users
> in identifying why these config options were set in their repo.

The implementation is quite straight-forward, inlining expansion of
repo_config_set_gently() in the places that we want to add comment to.

If we had (a lot) more than two callsites, I would have suggested to
add a simple helper function, something like

    static int scalar_config_set(struct repository *r, const char *key, const char *value)
    {
	char *file = repo_git_path(r, "config");
        int res = repo_config_set_multivar_in_file_gently(r, file,
		key, value, NULL, " # set by scalar", 0);
	free(file);
	return res;
    }

and then the updates to the callers would have been absolute minimum.

Well, even with only two callsites, perhaps such a refactoring may
still have value in reducing the risk of typo in the comment.

> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

Looks good.
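
As a minimal sketch of why the trailing comment is useful, a user could list the lines Scalar wrote by searching for the annotation. The two annotated lines below are the ones the test above greps for; the `editor` entry, the temporary file, and the `list_scalar_settings` helper are hypothetical and not part of the series.

```shell
# Hypothetical helper: print config lines that carry the trailing annotation.
list_scalar_settings () {
	grep '# set by scalar$' "$1"
}

# Sample config file. The annotated lines match what t9210 checks for; the
# "editor" line stands in for a value the user set themselves.
config=$(mktemp) &&
cat >"$config" <<'EOF' &&
[core]
	preloadIndex = true # set by scalar
	editor = vim
[log]
	excludeDecoration = refs/prefetch/* # set by scalar
EOF
list_scalar_settings "$config"
```

Only the two annotated lines are printed; the user's own `editor` setting is left out because it has no annotation.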


On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 03:55:10PM -0800, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> 
> > Add "# set by scalar" to the end of each config option to assist users
> > in identifying why these config options were set in their repo.
> 
> The implementation is quite straight-forward, inlining expansion of
> repo_config_set_gently() in the places that we want to add comment to.
> 
> If we had (a lot) more than two callsites, I would have suggested to
> add a simple helper function, something like
> 
>     static int scalar_config_set(struct repository *r, const char *key, const char *value)
>     {
> 	char *file = repo_git_path(r, "config");
>         int res = repo_config_set_multivar_in_file_gently(r, file,
> 		key, value, NULL, " # set by scalar", 0);
> 	free(file);
> 	return res;
>     }
> 
> and then the updates to the callers would have been absolute minimum.
> 
> Well, even with only two callsites, perhaps such a refactoring may
> still have value in reducing the risk of typo in the comment.

Agreed, I think it's a good idea to provide such a function. The calls
to `repo_config_set_multivar_in_file_gently()` are quite verbose.

Patrick

scalar.c Outdated
#endif
{ "core.logAllRefUpdates", "true", 1 },
{ "credential.https://dev.azure.com.useHttpPath", "true", 1 },
{ "credential.validate", "false", 1 }, /* GCM4W-only */

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index 43c210a23d..91d5964b73 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -246,6 +246,11 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
>  '
>  
>  test_expect_success 'scalar reconfigure --all with detached HEADs' '
> +	# This test demonstrates an issue with index.skipHash=true and
> +	# this test variable for the split index. Disable the test variable.
> +	GIT_TEST_SPLIT_INDEX= &&
> +	export GIT_TEST_SPLIT_INDEX &&

Interesting.  I would have expected to see a simple "sane_unset",
instead of exporting an empty setting explicitly.

>  	repos="two three four" &&
>  	for num in $repos
>  	do


On the Git mailing list, Derrick Stolee wrote (reply to this):

On 11/26/2025 6:57 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> 
>> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
>> index 43c210a23d..91d5964b73 100755
>> --- a/t/t9210-scalar.sh
>> +++ b/t/t9210-scalar.sh
>> @@ -246,6 +246,11 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
>>  '
>>  
>>  test_expect_success 'scalar reconfigure --all with detached HEADs' '
>> +	# This test demonstrates an issue with index.skipHash=true and
>> +	# this test variable for the split index. Disable the test variable.
>> +	GIT_TEST_SPLIT_INDEX= &&
>> +	export GIT_TEST_SPLIT_INDEX &&
> 
> Interesting.  I would have expected to see a simple "sane_unset",
> instead of exporting an empty setting explicitly.

That's indeed a better way to do it. Will do in v2.

Thanks,
-Stolee
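
For context, the difference between the two approaches can be sketched in plain shell. Exporting an empty value leaves the variable set (to the empty string), while unsetting removes it entirely. The `sane_unset` definition below is a stand-in written for this sketch, assumed to behave like the helper in Git's test library, and `describe_var` is a hypothetical name used only here.

```shell
# Stand-in for the test-library helper (assumption): unset the variables
# and report success even if some of them were not set to begin with.
sane_unset () {
	unset "$@"
	return 0
}

# Hypothetical helper: report whether the named variable is set at all.
# ${VAR+x} expands to "x" when VAR is set, even if its value is empty.
describe_var () {
	if eval "test -n \"\${$1+x}\""
	then
		echo "$1 is set"
	else
		echo "$1 is unset"
	fi
}

# The v1 approach: the variable stays in the environment with an empty value.
GIT_TEST_SPLIT_INDEX= &&
export GIT_TEST_SPLIT_INDEX &&
describe_var GIT_TEST_SPLIT_INDEX

# The suggested approach: the variable is removed from the environment.
sane_unset GIT_TEST_SPLIT_INDEX &&
describe_var GIT_TEST_SPLIT_INDEX
```

The first call reports the variable as set and the second as unset, which is why unsetting expresses "disable the test variable" more directly.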


static int set_recommended_config(int reconfigure)
{
struct scalar_config config[] = {

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> From: Derrick Stolee <[email protected]>
>
> These config values were added in the original Scalar contribution,
> d0feac4e8c (scalar: 'register' sets recommended config and starts
> maintenance, 2021-12-03), but were never fully checked for validity in
> the upstream Git project. At the time, Scalar was only intended for the
> contrib/ directory so did not have as rigorous of an investigation.
>
> Each config option has its own justification for removal:
>
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.
>
> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
>
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manager so this option is no longer needed to help users who
>   were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for a comprehensive list.  Very well described.

~~~~~~

delete <enlistment>::
This subcommand lets you delete an existing Scalar enlistment from your

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

"as the upgrade story solidifies"?

> +fetch.writeCommitGraph=false::
> +	This config setting was created to help users automatically udpate their
> +	commit-graph files as they perform fetches. However, this takes time
> +	from foreground fetches and pulls and Scalar uses background maintenance
> +	for this function instead.

"update their files".

> +index.threads=true::
> +	This tells Git to automatically detect how many threads it should use
> +	when reading the index in parallel due to the `core.preloadIndex=true`
> +	setting.

Is "due to the `core.preloadIndex=true` setting" part of this
sentence still relevant?


Other than that, superbly written.  Thanks, will queue.


On the Git mailing list, Derrick Stolee wrote (reply to this):

On 11/26/2025 7:09 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> 
>> +commitGraph.generationVersion=1::
>> +	While the preferred version is 2 for performance reasons, existing users
>> +	that had version 1 by default will need special care in upgrading to
>> +	version 2. This is likely to change in the future as the upgrade story
>> +	is solidifies.
> 
> "as the upgrade story solidifies"?

That's better than what I was going for which was "is solidified". Will fix.

>> +fetch.writeCommitGraph=false::
>> +	This config setting was created to help users automatically udpate their
>> +	commit-graph files as they perform fetches. However, this takes time
>> +	from foreground fetches and pulls and Scalar uses background maintenance
>> +	for this function instead.
> 
> "update their files".

Yes, thanks.

>> +index.threads=true::
>> +	This tells Git to automatically detect how many threads it should use
>> +	when reading the index in parallel due to the `core.preloadIndex=true`
>> +	setting.
> 
> Is "due to the `core.preloadIndex=true` setting" part of this
> sentence still relevant?

I should still include this, but mention that it is enabled by default and
still recommended.

> Other than that, superbly written.  Thanks, will queue.

Thanks,
-Stolee


gitgitgadget bot commented Nov 29, 2025

This patch series was integrated into seen via git@175d67a.

@gitgitgadget gitgitgadget bot added the seen label Nov 29, 2025

gitgitgadget bot commented Dec 1, 2025

This branch is now known as ds/doc-scalar-config.


gitgitgadget bot commented Dec 1, 2025

This patch series was integrated into seen via git@d20a0d3.


gitgitgadget bot commented Dec 1, 2025

There was a status update in the "Cooking" section about the branch ds/doc-scalar-config on the Git mailing list:

Documentation updates.

Expecting a reroll.
cf. <[email protected]>
source: <[email protected]>

fsm_settings__get_reason(the_repository) == FSMONITOR_REASON_OK;
}

static int set_recommended_config(int reconfigure)

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <[email protected]>
> 
> The config values set by Scalar went through an audit in the previous
> changes, so now reorganize the settings and simplify their purpose.
> 
> First, alphabetize the config options, except put the platform-specific
> options at the end. This groups two Windows-specific settings and only
> one non-Windows setting.
> 
> Also, this removes the 'overwrite_on_reconfigure' setting for many of
> these options. That setting made nearly all of these options "required"
> for scalar enlistments, restricting use for users. Instead, now nearly
> all options have removed this setting.

As far as I understand, this setting causes us to overwrite any
preexisting config values when reconfiguring Scalar? So with your
changes the effect is that we now don't do that anymore, which allows
the user to tune some of the configuration values to their liking after
having run `scalar init` for the first time. I guess that makes sense,
as it gives the user more flexibility.

It does make me wonder though: is it really the most sensible thing to
overwrite any keys that already exist in the configuration? We may end
up overwriting configuration specified by the user both in the case of
`scalar init` and `scalar reconfigure`. But arguably, we might want to
only ever write configuration that does _not_ yet have an explicit value
in the configuration file, regardless of whether or not we reconfigure.

> However, there is one setting that still has this, which is
> index.skipHash, which was previously being set to _false_ when we
> actually prefer the value of true. Keep the overwrite here to help
> Scalar users upgrade to the new version. We may remove that overwrite in
> the future once we believe that most of the users who have the false
> value have upgraded to a version that overwrites that to 'true'.

Makes sense. This has likely been a bug, and we now want to rectify that
bug.

Thanks!

Patrick


On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <[email protected]>
>>
>> The config values set by Scalar went through an audit in the previous
>> changes, so now reorganize the settings and simplify their purpose.
>>
>> First, alphabetize the config options, except put the platform-specific
>> options at the end. This groups two Windows-specific settings and only
>> one non-Windows setting.
>>
>> Also, this removes the 'overwrite_on_reconfigure' setting for many of
>> these options. That setting made nearly all of these options "required"
>> for scalar enlistments, restricting use for users. Instead, now nearly
>> all options have removed this setting.
>
> As far as I understand, this setting causes us to overwrite any
> preexisting config values when reconfiguring Scalar? So with your
> changes the effect is that we now don't do that anymore, which allows
> the user to tune some of the configuration values to their liking after
> having run `scalar init` for the first time. I guess that makes sense,
> as it gives the user more flexibility.

Yes, that is correct.

> It does make me wonder though: is it really the most sensible thing to
> overwrite any keys that already exist in the configuration? We may end
> up overwriting configuration specified by the user both in the case of
> `scalar init` and `scalar reconfigure`. But arguably, we might want to
> only ever write configuration that does _not_ yet have an explicit value
> in the configuration file, regardless of whether or not we reconfigure.

I agree that this notion of forcing config is not optimal, and is a leftover
from VFS for Git where some of these config things were actually required
for the virtualization to work. Once that idea was in place, it was easy
to think "we'll make sure the repo is configured correctly" but that makes
much less sense in Scalar these days.

>> However, there is one setting that still has this, which is
>> index.skipHash, which was previously being set to _false_ when we
>> actually prefer the value of true. Keep the overwrite here to help
>> Scalar users upgrade to the new version. We may remove that overwrite in
>> the future once we believe that most of the users who have the false
>> value have upgraded to a version that overwrites that to 'true'.
>
> Makes sense. This has likely been a bug, and we now want to rectify that
> bug.

And hopefully this is the only reason we'd need this "overwrite" feature
from this point on.

Thanks,
-Stolee
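
The rule discussed in this exchange can be sketched as a small decision function. This is only an illustration under stated assumptions: a recommended value is written when the key has no value yet, and an existing value is replaced only during a reconfigure for entries marked `overwrite_on_reconfigure`. The `should_write` name and its yes/no arguments are hypothetical and do not appear in scalar.c.

```shell
# Hypothetical sketch of the decision, not the code in scalar.c.
#   $1: "yes" if the key already has a value in the repository config
#   $2: "yes" if the entry is marked overwrite_on_reconfigure
#   $3: "yes" if this is a reconfigure run
# Returns 0 when the recommended value should be written.
should_write () {
	key_exists=$1
	overwrite=$2
	reconfigure=$3
	if test "$reconfigure" = yes && test "$overwrite" = yes
	then
		return 0	# forced entries replace existing values on reconfigure
	fi
	test "$key_exists" != yes	# otherwise only fill in missing keys
}

# An existing user value for a relaxed option is kept on reconfigure.
should_write yes no yes && echo write || echo keep

# An entry that keeps the overwrite (as index.skipHash does) is rewritten.
should_write yes yes yes && echo write || echo keep
```

Under these assumptions the first call prints "keep" and the second prints "write", which matches the intent of relaxing the overwrite for everything except index.skipHash.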

~~~~~~

delete <enlistment>::
This subcommand lets you delete an existing Scalar enlistment from your

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
> index f81b2832f8..b34af225e6 100644
> --- a/Documentation/scalar.adoc
> +++ b/Documentation/scalar.adoc
> @@ -197,6 +197,164 @@ delete <enlistment>::
>  	This subcommand lets you delete an existing Scalar enlistment from your
>  	local file system, unregistering the repository.
>  
> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------
> +
> +As part of both `scalar clone` and `scalar register`, certain Git config
> +values are set to optimize for large repositories or cross-platform support.
> +These options are updated in new Git versions according to the best known
> +advice for large repositories, and users can get the latest recommendations
> +by running `scalar reconfigure [--all]`.
> +
> +This section lists justifications for the config values that are set in the
> +latest version.
> +
> +am.keepCR=true::
> +	This setting is important for cross-platform development across Windows
> +	and non-Windows platforms and keeping carriage return (`\r`) characters
> +	in certain workflows.
> +
> +commitGraph.changedPaths=true::
> +	This setting helps the background maintenance steps that compute the
> +	serialized commit-graph to also store changed-path Bloom filters. This
> +	accelerates file history commands and allows users to automatically
> +	benefit without running a foreground command.

Is this something we also want to promote to "default" eventually? The
downside of course is that maintenance takes a bit longer, but given
that it runs in the background anyway this shouldn't really impact our
users all that much.

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

Is that still the case? We _did_ have some bugs in the upgrade path in
the past, but I thought it got all sorted out by now?

[snip]
> +fetch.unpackLimit=1::
> +	This setting prevents Git from unpacking packfiles into loose objects
> +	as they are downloaded from the server. This feature was intended as a
> +	way to prevent performance issues from too many packfiles, but Scalar
> +	uses background maintenance to group packfiles and cover them with a
> +	multi-pack-index, removing this issue.

The second sentence here reads as if "fetch.unpackLimit=1" was the
feature you are talking about, which led to some puzzlement at first.
But what you are talking about is the _default_ unpack limit of 100.
Maybe something like this reads better?

    This setting prevents Git from unpacking packfiles into loose objects
    as they are downloaded from the server. The default limit of 100
    objects was intended as a way to prevent performance issues from too
    many packfiles, but Scalar uses background maintenance to group
    packfiles and cover them with a multi-pack-index, removing this
    issue.

Patrick


On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
>> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
>> index f81b2832f8..b34af225e6 100644
>> --- a/Documentation/scalar.adoc
>> +++ b/Documentation/scalar.adoc
>> @@ -197,6 +197,164 @@ delete <enlistment>::
>>   	This subcommand lets you delete an existing Scalar enlistment from your
>>   	local file system, unregistering the repository.
>>
>> +REQUIRED AND RECOMMENDED CONFIG
>> +-------------------------------
>> +
>> +As part of both `scalar clone` and `scalar register`, certain Git config
>> +values are set to optimize for large repositories or cross-platform support.
>> +These options are updated in new Git versions according to the best known
>> +advice for large repositories, and users can get the latest recommendations
>> +by running `scalar reconfigure [--all]`.
>> +
>> +This section lists justifications for the config values that are set in the
>> +latest version.
>> +
>> +am.keepCR=true::
>> +	This setting is important for cross-platform development across Windows
>> +	and non-Windows platforms and keeping carriage return (`\r`) characters
>> +	in certain workflows.
>> +
>> +commitGraph.changedPaths=true::
>> +	This setting helps the background maintenance steps that compute the
>> +	serialized commit-graph to also store changed-path Bloom filters. This
>> +	accelerates file history commands and allows users to automatically
>> +	benefit without running a foreground command.
>
> Is this something we also want to promote to "default" eventually? The
> downside of course is that maintenance takes a bit longer, but given
> that it runs in the background anyway this shouldn't really impact our
> users all that much.

I'm not sure, as this is a significant cost to the computation time. It will
impact foreground commands, as well. It increases the size of the file, too.

It's worth considering, but I don't think the answer is very simple.

>> +commitGraph.generationVersion=1::
>> +	While the preferred version is 2 for performance reasons, existing users
>> +	that had version 1 by default will need special care in upgrading to
>> +	version 2. This is likely to change in the future as the upgrade story
>> +	is solidifies.
>
> Is that still the case? We _did_ have some bugs in the upgrade path in
> the past, but I thought it got all sorted out by now?

This is very likely, but I haven't validated myself. I'd be interested to
double-check and update this setting in a later series. If we update to 2,
then this would be a good reason to overwrite the old config for a while.

> [snip]
>> +fetch.unpackLimit=1::
>> +	This setting prevents Git from unpacking packfiles into loose objects
>> +	as they are downloaded from the server. This feature was intended as a
>> +	way to prevent performance issues from too many packfiles, but Scalar
>> +	uses background maintenance to group packfiles and cover them with a
>> +	multi-pack-index, removing this issue.
>
> The second sentence here reads as if "fetch.unpackLimit=1" was the
> feature you are talking about, which led to some puzzlement at first.
> But what you are talking about is the _default_ unpack limit of 100.
> Maybe something like this reads better?
>
>      This setting prevents Git from unpacking packfiles into loose objects
>      as they are downloaded from the server. The default limit of 100
>      objects was intended as a way to prevent performance issues from too
>      many packfiles, but Scalar uses background maintenance to group
>      packfiles and cover them with a multi-pack-index, removing this
>      issue.

Good catch. Thanks!

-Stolee

derrickstolee and others added 4 commits December 1, 2025 07:30
A repo may have config options set by 'scalar clone' or 'scalar
register' and then updated by 'scalar reconfigure'. It can be helpful to
point out which of those options were set by the latest scalar
recommendations.

Add "# set by scalar" to the end of each config option to assist users
in identifying why these config options were set in their repo. Use a new
helper method to simplify the two callsites.

Co-authored-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Derrick Stolee <[email protected]>

The index.skipHash config option has been set to 'false' by Scalar since
4933152 (scalar: enable path-walk during push via config, 2025-05-16)
but that commit message is trying to communicate the exact opposite:
that the 'true' value is what we want instead. This means that we've
been disabling this performance benefit for Scalar repos
unintentionally.

Fix this issue before we add justification for the config options set in
this list.

Oddly, enabling index.skipHash causes a test issue during 'test_commit'
in one of the Scalar tests when GIT_TEST_SPLIT_INDEX is enabled (as
caught by the linux-test-vars build). I'm fixing the test by disabling
the environment variable, but the issue should be resolved in a series
focused on the split index.

Signed-off-by: Derrick Stolee <[email protected]>

These config values were added in the original Scalar contribution,
d0feac4 (scalar: 'register' sets recommended config and starts
maintenance, 2021-12-03), but were never fully checked for validity in
the upstream Git project. At the time, Scalar was only intended for the
contrib/ directory, so it did not receive as rigorous an investigation.

Each config option has its own justification for removal:

* core.preloadIndex: This value is now true by default. Removing it
  requires changes to the tests that checked this config value, so
  check gui.gcwarning=false there instead.

* core.fscache: This config does not exist in the core Git project, but
  is instead a config option for a Git for Windows feature.

* core.multiPackIndex: This config value is now enabled by default, so
  does not need to be called out specifically. It was originally
  included to make sure the background maintenance that created
  multi-pack-indexes would result in the expected performance
  improvements.

* credential.validate: This option is not something specific to Git but
  instead an older version of Git Credential Manager for Windows. That
  software was replaced several years ago by the cross-platform Git
  Credential Manager so this option is no longer needed to help users who
  were on that older software.

* pack.useSparse=true: This value is now Git's default as of de3a864
  (config: set pack.useSparse=true by default, 2020-03-20) so we don't
  need it set by Scalar.

Signed-off-by: Derrick Stolee <[email protected]>

The config values set by Scalar went through an audit in the previous
changes, so now reorganize the settings and simplify their purpose.

First, alphabetize the config options, except put the platform-specific
options at the end. This groups two Windows-specific settings and only
one non-Windows setting.

Also, this removes the 'overwrite_on_reconfigure' setting for many of
these options. That setting made nearly all of these options "required"
for scalar enlistments, restricting use for users. Instead, now nearly
all options have removed this setting.

However, there is one setting that still has this, which is
index.skipHash, which was previously being set to _false_ when we
actually prefer the value of true. Keep the overwrite here to help
Scalar users upgrade to the new version. We may remove that overwrite in
the future once we believe that most of the users who have the false
value have upgraded to a version that overwrites that to 'true'.

Signed-off-by: Derrick Stolee <[email protected]>

Add user-facing documentation that justifies the values being set by
'scalar clone', 'scalar register', and 'scalar reconfigure'.

Helped-by: Junio C Hamano <[email protected]>
Helped-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Derrick Stolee <[email protected]>

gitgitgadget bot commented Dec 1, 2025

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Stolee,

On Wed, 26 Nov 2025, Derrick Stolee via GitGitGadget wrote:

> In September [1], we discussed that the Scalar config options could use some
> documented justification as well as some comments to the config file that
> they were set by Scalar. I was then immediately distracted by other work
> things and am finally here with a series to do just that.

Thank you for doing this; in particular, the (quite long!) list of
explanations is excellent, especially when some user wonders why a
particular setting was chosen and wants to understand the reason.

> 
> [1]
> https://lore.kernel.org/git/[email protected]/
> 
> I have indeed used Patrick's idea to add '# set by scalar' to each line
> added by Scalar, it took a little more work for all the kinds of config set.

I am glad that the work I put in to optionally add comments pays off.

It's a bit sad that there is no well-designed bulk-edit "API" function
which therefore requires constructing and `free()`ing that `file` variable
many times, but that's not the fault of this series.

> I made myself a co-author.
> 
> While working to justify each config option, I found some stale or incorrect
> config options. I also relaxed the override setting in most cases which gave
> me an opportunity to alphabetize the settings.
> 
> There was at least one case (I'm thinking of core.fscache here) where the
> config doesn't even exist in core Git, but instead in Git for Windows. We'll
> need to adjust in that fork to reinclude it in the right place.

Thank you for calling this out! I will take care of this in Git for
Windows and also in Microsoft Git (which inherits this flag from Git for
Windows).

To be honest, I am not so certain that we want the FSCache to be enabled,
it does have long-standing bugs (introduced by the partial clone feature,
for example, where the FSCache continues to retain stale information about
which loose objects are present even after the missing ones have been
fetched). I guess we'll have to measure the actual performance benefits to
reassess whether the feature is worth the trouble.

Thank you for your diligent work, as always,
Johannes

> 
> Thanks, -Stolee
> 
> Derrick Stolee (5):
>   scalar: annotate config file with "set by scalar"
>   scalar: use index.skipHash=true for performance
>   scalar: remove stale config values
>   scalar: alphabetize and simplify config
>   scalar: document config settings
> 
>  Documentation/scalar.adoc | 158 ++++++++++++++++++++++++++++++++++++++
>  scalar.c                  |  81 ++++++++++---------
>  t/t9210-scalar.sh         |  26 ++++---
>  3 files changed, 218 insertions(+), 47 deletions(-)
> 
> 
> base-commit: 6ab38b7e9cc7adafc304f3204616a4debd49c6e9
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2010%2Fderrickstolee%2Fscalar-config-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2010/derrickstolee/scalar-config-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/2010
> -- 
> gitgitgadget
> 

@derrickstolee
Author

/submit

@gitgitgadget

gitgitgadget bot commented Dec 1, 2025

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2010/derrickstolee/scalar-config-v2

To fetch this version to local tag pr-2010/derrickstolee/scalar-config-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2010/derrickstolee/scalar-config-v2


static int set_recommended_config(int reconfigure)
{
struct scalar_config config[] = {

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.

I was going to ask about if we could also rely on the default value of
index.threads like we do here, but then went and did some reading and realised
some config values, like index.recordOffsetTable, have their value set
according to whether index.threads was explicitly set, so I guess there's an
implicit reliance on that behaviour that we want to keep?

> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
> 
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
> 
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manager so this option is no longer needed to help users who
>   were on that older software.
> 
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for the detail on all of these, very helpful

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 05:46:46PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> > * core.preloadIndex: This value is true by default, now. Removing this
> >   causes some changes required to the tests that checked this config
> >   value. Use gui.gcwarning=false instead.
> 
> I was going to ask about if we could also rely on the default value of
> index.threads like we do here, but then went and did some reading and realised
> some config values, like index.recordOffsetTable, have their value set
> according to whether index.threads was explicitly set, so I guess there's an
> implicit reliance on that behaviour that we want to keep?

Wait. Are you saying that "index.recordOffsetTable" behaves differently
based on whether "index.threads" is implicitly enabled due to the
default value or explicitly enabled via the configuration? If so, that
smells like a plain bug to me.

Patrick

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> Wait. Are you saying that "index.recordOffsetTable" behaves differently
> based on whether "index.threads" is implicitly enabled due to the
> default value or explicitly enabled via the configuration?

That was my understanding from a cursory read of the results of searching for
'index.threads' in git-config:

> index.recordEndOfIndexEntries
> ...
> Defaults to true if index.threads has been explicitly enabled, false
> otherwise

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Dec 02, 2025 at 07:04:24PM +0000, Matthew Hughes wrote:
> On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> > Wait. Are you saying that "index.recordOffsetTable" behaves differently
> > based on whether "index.threads" is implicitly enabled due to the
> > default value or explicitly enabled via the configuration?
> 
> That was my understanding from a cursory read of the results of searching for
> 'index.threads' in git-config:
> 
> > index.recordEndOfIndexEntries
> > ...
> > Defaults to true if index.threads has been explicitly enabled, false
> > otherwise

Hm, true. At least that's a conscious decision then.

The logic around this was introduced in 2a9dedef2e (index: make
index.threads=true enable ieot and eoie, 2018-11-19), and the ultimate
reason for it seems to be backwards compatibility:

    index.threads and index.recordOffsetTable unspecified: do not write
    the offset table yet (to avoid alarming the user with "ignoring IEOT
    extension" messages when an older version of Git accesses the
    repository) but do make use of multiple threads to read the index if
    the supporting offset table is present.

Older versions of Git complained when they saw unknown extensions, and
we didn't want to expose users to such warnings. That makes me wonder
whether it's time now to revisit that decision: it's been 7 years
since then, and I guess that many clients nowadays would understand the
extension.

The only (documented) downside should thus not be that important
anymore, but the upside is that reading the index would be faster if we
default-enable writing the extension.
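
The defaulting rule quoted above can be modelled roughly as follows. This
is a sketch based on the documented behaviour, not Git's actual code: the
function name and the convention of passing an empty string for "unset"
are hypothetical, and treating "false" and "1" as "threading disabled" is
an assumption taken from the index.threads documentation.

```shell
# Hypothetical model of the documented default for index.recordOffsetTable.
# $1: explicit value of index.recordOffsetTable ("" if unset)
# $2: explicit value of index.threads ("" if unset)
effective_record_offset_table () {
	ieot=$1 threads=$2
	if test -n "$ieot"
	then
		# An explicit index.recordOffsetTable always wins.
		echo "$ieot"
	elif test -z "$threads" || test "$threads" = false || test "$threads" = 1
	then
		# index.threads is unset, or threading is explicitly disabled
		# (assumption: "false" and "1" both mean "no threading").
		echo false
	else
		# index.threads is explicitly enabled: write the offset table.
		echo true
	fi
}

effective_record_offset_table "" ""      # index.threads left at its default
effective_record_offset_table "" true    # index.threads set explicitly
```

The two calls differ only in whether index.threads was set explicitly,
which is the distinction discussed in this subthread.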

Patrick

@gitgitgadget

gitgitgadget bot commented Dec 1, 2025

User Matthew Hughes <[email protected]> has been added to the cc: list.

~~~~~~

delete <enlistment>::
This subcommand lets you delete an existing Scalar enlistment from your

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> Add user-facing documentation that justifies the values being set by
> 'scalar clone', 'scalar register', and 'scalar reconfigure'.

Thanks! This is exactly what I was hoping for.

> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------

Would it be worth noting in scalar.c that the config options listed there are
documented here, so that a dev changing the list in the source will know to
also update this? I assume there's an understanding that if e.g. you update a
flag you should know to also update relevant docs, but perhaps this is a bit
more niche.

> +gc.auto=0::
> +	This disables automatic garbage collection, since Scalar uses background
> +	maintenance to keep the repository data in good shape.

Checking my understanding: this means there will be _no_ automatic GC in a
scalar repo? Since scalar calls 'maintenance register' which means
maintenance.strategy will be set to 'incremental' which won't schedule any gc
runs

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 05:58:06PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> > Add user-facing documentation that justifies the values being set by
> > 'scalar clone', 'scalar register', and 'scalar reconfigure'.
> 
> Thanks! This is exactly what I was hoping for.
> 
> > +REQUIRED AND RECOMMENDED CONFIG
> > +-------------------------------
> 
> Would it be worth noting in scalar.c that the config options listed there are
> documented here, so that a dev changing the list in the source will know to
> also update this? I assume there's an understanding that if e.g. you update a
> flag you should know to also update relevant docs, but perhaps this is a bit
> more niche.
> 
> > +gc.auto=0::
> > +	This disables automatic garbage collection, since Scalar uses background
> > +	maintenance to keep the repository data in good shape.
> 
> Checking my understanding: this means there will be _no_ automatic GC in a
> scalar repo? Since scalar calls 'maintenance register' which means
> maintenance.strategy will be set to 'incremental' which won't schedule any gc
> runs

Yes, auto-garbage-collection is completely disabled in repositories
managed by Scalar. And I guess that made sense in the past:
auto-maintenance did not know about maintenance strategies at all, and
consequently it would still run git-gc(1). And that's not really
compatible with the "incremental" strategy that Scalar wants to use.

I changed that in Git 2.52 so that maintenance strategies now apply to
both scheduled and normal maintenance. But I was worried about backwards
compatibility for the "incremental" strategy, so I made the change in a
backwards compatible way so that normal maintenance still ends up using
git-gc(1).

Arguably though, we can now iterate on our infrastructure: if we were to
introduce an "incremental-v2" strategy we could adapt it to have proper
strategies for both scheduled and normal maintenance. And if so, we can
adapt Scalar in such a way that it doesn't have to disable auto
maintenance anymore.

I think that would be a reasonable thing to do. Scheduled maintenance
only runs once per hour, and in a high-activity repo a user may easily
generate tons of objects in that hour that make the repository perform
badly.
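
To see what this looks like in a given repository, the two settings
discussed here can be read back with plain `git config`. A minimal sketch;
the helper name `show_gc_settings` is hypothetical, and keys that are not
set anywhere are reported as "(unset)".

```shell
# Hypothetical helper: print the auto-gc and maintenance strategy settings
# that are in effect for the repository at "$1".
show_gc_settings () {
	repo=$1
	for key in gc.auto maintenance.strategy
	do
		# "git config --get" exits non-zero when the key is not set.
		value=$(git -C "$repo" config --get "$key") || value="(unset)"
		printf '%s=%s\n' "$key" "$value"
	done
}
```

In a Scalar enlistment, `show_gc_settings .` would be expected to report
`gc.auto=0`, matching the documentation quoted above.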

Patrick

@gitgitgadget

gitgitgadget bot commented Dec 2, 2025

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

>      -@@ scalar.c: static int set_scalar_config(const struct scalar_config *config, int reconfigure
>      +@@ scalar.c: struct scalar_config {
>      + 	int overwrite_on_reconfigure;
>      + };
>      + 
>      ++static int set_config_with_comment(const char *key, const char *value)

I do not care too deeply as this is a file-scope static that is
called only twice, but I would have preferred scalar_set_config()
which is a lot more specific to the purpose of this function (and the
comment "# set by scalar" is hardcoded constant in this function
that its callers cannot affect, so "with_comment" is not even a
statement that "the callers can add comment to their config
settings") which would have taken a bit shorter line to call.

>       +fetch.unpackLimit=1::
>       +	This setting prevents Git from unpacking packfiles into loose objects
>      -+	as they are downloaded from the server. This feature was intended as a
>      -+	way to prevent performance issues from too many packfiles, but Scalar
>      -+	uses background maintenance to group packfiles and cover them with a
>      -+	multi-pack-index, removing this issue.
>      ++	as they are downloaded from the server. The default limit of 100 was
>      ++	intended as a way to prevent performance issues from too many packfiles,
>      ++	but Scalar uses background maintenance to group packfiles and cover them
>      ++	with a multi-pack-index, removing this issue.

Nicely explained.

Will replace (when I land).

Thanks.

#include "refs.h"
#include "dir.h"
#include "packfile.h"
#include "help.h"

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:43PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

We _could_ make this a bit more solid by adding a test that:

  1. Initializes a new repository.

  2. Saves the configuration.

  3. Performs `scalar reconfigure`.

  4. Asserts that all new non-section-header lines in the configuration
     have a trailing "# set by scalar" comment.

This would ensure that there is no callsite we forgot to add the new
annotation to, and would catch future callsites added by somebody who
isn't aware of the comments.

I don't insist on such a test though, so please feel free to ignore this
suggestion.
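
The check in step 4 could be sketched as a small shell helper. This is only
an illustration under assumptions: the helper name is hypothetical, it
compares whole lines of the saved and the new config file, and it expects
the comment in the "# set by scalar" form that the test above greps for.

```shell
# Hypothetical helper: print every line of the config file "$2" that is not
# present in the saved copy "$1", is neither a section header nor blank, and
# lacks a trailing "# set by scalar" comment.
# Caveat: a new line that is byte-identical to an old one is not detected.
unannotated_new_lines () {
	before=$1 after=$2
	# -F: fixed strings, -x: whole-line match, -v: keep non-matching lines,
	# -f: read the patterns (the old lines) from the saved copy.
	grep -vxF -f "$before" "$after" |
	sed -e '/^[[:space:]]*\[/d' \
	    -e '/^[[:space:]]*$/d' \
	    -e '/# set by scalar$/d'
}
```

A test could save the config, run `scalar reconfigure`, and then assert that
`unannotated_new_lines` prints nothing for the saved and the updated file.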

Patrick
