Audit and document Scalar config #2010

derrickstolee · 2025-11-26T17:17:47Z

In September [1], we discussed that the Scalar config options could use some documented justification as well as some comments to the config file that they were set by Scalar. I was then immediately distracted by other work things and am finally here with a series to do just that.

[1] https://lore.kernel.org/git/[email protected]/

I have indeed used Patrick's idea to add '# set by scalar' to each line added by Scalar, though it took a little more work to cover all the kinds of config set. I made myself a co-author.

While working to justify each config option, I found some stale or incorrect config options. I also relaxed the override setting in most cases which gave me an opportunity to alphabetize the settings.

There was at least one case (I'm thinking of core.fscache here) where the config doesn't even exist in core Git, but instead in Git for Windows. We'll need to adjust in that fork to reinclude it in the right place.

Updates in V2

* The config-setting code is simplified somewhat.
* Use 'sane_unset' instead of 'export' in test.
* Documentation is improved for typos, grammar, and clarity.

Thanks,
-Stolee

cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: Matthew Hughes [email protected]

derrickstolee · 2025-11-26T22:17:36Z

/submit

gitgitgadget · 2025-11-26T22:19:15Z

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2010/derrickstolee/scalar-config-v1

To fetch this version to local tag pr-2010/derrickstolee/scalar-config-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2010/derrickstolee/scalar-config-v1

gitgitgadget · 2025-11-26T23:57:48Z

scalar.c

 #include "refs.h"
 #include "dir.h"
 #include "packfile.h"
 #include "help.h"


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> Add "# set by scalar" to the end of each config option to assist users
> in identifying why these config options were set in their repo.

The implementation is quite straight-forward, inlining expansion of
repo_config_set_gently() in the places that we want to add comment to.

If we had (a lot) more than two callsites, I would have suggested to
add a simple helper function, something like

    static int scalar_config_set(struct repository *r, const char *key, const char *value)
    {
        char *file = repo_git_path(r, "config");
        int res = repo_config_set_multivar_in_file_gently(r, file,
                        key, value, NULL, " # set by scalar", 0);
        free(file);
        return res;
    }

and then the updates to the callers would have been absolute minimum.

Well, even with only two callsites, perhaps such a refactoring may
still have value in reducing the risk of typo in the comment.

> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

Looks good.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 03:55:10PM -0800, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>
> > Add "# set by scalar" to the end of each config option to assist users
> > in identifying why these config options were set in their repo.
>
> The implementation is quite straight-forward, inlining expansion of
> repo_config_set_gently() in the places that we want to add comment to.
>
> If we had (a lot) more than two callsites, I would have suggested to
> add a simple helper function, something like
>
>     static int scalar_config_set(struct repository *r, const char *key, const char *value)
>     {
>         char *file = repo_git_path(r, "config");
>         int res = repo_config_set_multivar_in_file_gently(r, file,
>                         key, value, NULL, " # set by scalar", 0);
>         free(file);
>         return res;
>     }
>
> and then the updates to the callers would have been absolute minimum.
>
> Well, even with only two callsites, perhaps such a refactoring may
> still have value in reducing the risk of typo in the comment.

Agreed, I think it's a good idea to provide such a function. The calls
to `repo_config_set_multivar_in_file_gently()` are quite verbose.

Patrick

gitgitgadget · 2025-11-27T01:50:39Z

scalar.c

 #endif
 		{ "core.logAllRefUpdates", "true", 1 },
 		{ "credential.https://dev.azure.com.useHttpPath", "true", 1 },
 		{ "credential.validate", "false", 1 }, /* GCM4W-only */


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index 43c210a23d..91d5964b73 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -246,6 +246,11 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
>  '
>
>  test_expect_success 'scalar reconfigure --all with detached HEADs' '
> +	# This test demonstrates an issue with index.skipHash=true and
> +	# this test variable for the split index. Disable the test variable.
> +	GIT_TEST_SPLIT_INDEX= &&
> +	export GIT_TEST_SPLIT_INDEX &&

Interesting. I would have expected to see a simple "sane_unset",
instead of exporting an empty setting explicitly.

>  	repos="two three four" &&
>  	for num in $repos
>  	do

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 11/26/2025 6:57 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>
>> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
>> index 43c210a23d..91d5964b73 100755
>> --- a/t/t9210-scalar.sh
>> +++ b/t/t9210-scalar.sh
>> @@ -246,6 +246,11 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
>>  '
>>
>>  test_expect_success 'scalar reconfigure --all with detached HEADs' '
>> +	# This test demonstrates an issue with index.skipHash=true and
>> +	# this test variable for the split index. Disable the test variable.
>> +	GIT_TEST_SPLIT_INDEX= &&
>> +	export GIT_TEST_SPLIT_INDEX &&
>
> Interesting. I would have expected to see a simple "sane_unset",
> instead of exporting an empty setting explicitly.

That's indeed a better way to do it. Will do in v2.

Thanks,
-Stolee

gitgitgadget · 2025-11-27T01:51:03Z

scalar.c


 static int set_recommended_config(int reconfigure)
 {
 	struct scalar_config config[] = {


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> From: Derrick Stolee <[email protected]>
>
> These config values were added in the original Scalar contribution,
> d0feac4e8c (scalar: 'register' sets recommended config and starts
> maintenance, 2021-12-03), but were never fully checked for validity in
> the upstream Git project. At the time, Scalar was only intended for the
> contrib/ directory so did not have as rigorous of an investigation.
>
> Each config option has its own justification for removal:
>
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.
>
> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
>
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manger so this option is no longer needed to help users who
>   were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for a comprehensive list. Very well described.

gitgitgadget · 2025-11-27T01:51:35Z

Documentation/scalar.adoc

 ~~~~~~

 delete <enlistment>::
 	This subcommand lets you delete an existing Scalar enlistment from your


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

"as the upgrade story solidifies"?

> +fetch.writeCommitGraph=false::
> +	This config setting was created to help users automatically udpate their
> +	commit-graph files as they perform fetches. However, this takes time
> +	from foreground fetches and pulls and Scalar uses background maintenance
> +	for this function instead.

"update their files".

> +index.threads=true::
> +	This tells Git to automatically detect how many threads it should use
> +	when reading the index in parallel due to the `core.preloadIndex=true`
> +	setting.

Is "due to the `core.preloadIndex=true` setting" part of this
sentence still relevant?

Other than that, superbly written. Thanks, will queue.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 11/26/2025 7:09 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>
>> +commitGraph.generationVersion=1::
>> +	While the preferred version is 2 for performance reasons, existing users
>> +	that had version 1 by default will need special care in upgrading to
>> +	version 2. This is likely to change in the future as the upgrade story
>> +	is solidifies.
>
> "as the upgrade story solidifies"?

That's better than what I was going for which was "is solidified".
Will fix.

>> +fetch.writeCommitGraph=false::
>> +	This config setting was created to help users automatically udpate their
>> +	commit-graph files as they perform fetches. However, this takes time
>> +	from foreground fetches and pulls and Scalar uses background maintenance
>> +	for this function instead.
>
> "update their files".

Yes. Thanks.

>> +index.threads=true::
>> +	This tells Git to automatically detect how many threads it should use
>> +	when reading the index in parallel due to the `core.preloadIndex=true`
>> +	setting.
>
> Is "due to the `core.preloadIndex=true` setting" part of this
> sentence still relevant?

I should still include this, but mention that it is enabled by default
and still recommended.

> Other than that, superbly written. Thanks, will queue.

Thanks,
-Stolee

gitgitgadget · 2025-11-29T07:19:35Z

This patch series was integrated into seen via git@175d67a.

gitgitgadget · 2025-12-01T04:01:53Z

This branch is now known as ds/doc-scalar-config.

gitgitgadget · 2025-12-01T04:01:54Z

This patch series was integrated into seen via git@d20a0d3.

gitgitgadget · 2025-12-01T05:41:37Z

There was a status update in the "Cooking" section about the branch ds/doc-scalar-config on the Git mailing list:

Documentation updates.

Expecting a reroll.
cf. <[email protected]>
source: <[email protected]>

gitgitgadget · 2025-12-01T09:21:55Z

scalar.c

 	       fsm_settings__get_reason(the_repository) == FSMONITOR_REASON_OK;
 }

 static int set_recommended_config(int reconfigure)


On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <[email protected]>
>
> The config values set by Scalar went through an audit in the previous
> changes, so now reorganize the settings and simplify their purpose.
>
> First, alphabetize the config options, except put the platform-specific
> options at the end. This groups two Windows-specific settings and only
> one non-Windows setting.
>
> Also, this removes the 'overwrite_on_reconfigure' setting for many of
> these options. That setting made nearly all of these options "required"
> for scalar enlistments, restricting use for users. Instead, now nearly
> all options have removed this setting.

As far as I understand, this setting causes us to overwrite any
preexisting config values when reconfiguring Scalar? So with your
changes the effect is that we now don't do that anymore, which allows
the user to tune some of the configuration values to their liking after
having run `scalar init` for the first time. I guess that makes sense,
as it gives the user more flexibility.

It does make me wonder though: is it really the most sensible thing to
overwrite any keys that already exist in the configuration? We may end
up overwriting configuration specified by the user both in the case of
`scalar init` and `scalar reconfigure`. But arguably, we might want to
only ever write configuration that does _not_ yet have an explicit value
in the configuration file, regardless of whether or not we reconfigure.

> However, there is one setting that still has this, which is
> index.skipHash, which was previously being set to _false_ when we
> actually prefer the value of true. Keep the overwrite here to help
> Scalar users upgrade to the new version. We may remove that overwrite in
> the future once we belive that most of the users who have the false
> value have upgraded to a version that overwrites that to 'true'.

Makes sense. This has likely been a bug, and we now want to rectify that
bug.

Thanks!

Patrick

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <[email protected]>
>>
>> The config values set by Scalar went through an audit in the previous
>> changes, so now reorganize the settings and simplify their purpose.
>>
>> First, alphabetize the config options, except put the platform-specific
>> options at the end. This groups two Windows-specific settings and only
>> one non-Windows setting.
>>
>> Also, this removes the 'overwrite_on_reconfigure' setting for many of
>> these options. That setting made nearly all of these options "required"
>> for scalar enlistments, restricting use for users. Instead, now nearly
>> all options have removed this setting.
>
> As far as I understand, this setting causes us to overwrite any
> preexisting config values when reconfiguring Scalar? So with your
> changes the effect is that we now don't do that anymore, which allows
> the user to tune some of the configuration values to their liking after
> having run `scalar init` for the first time. I guess that makes sense,
> as it gives the user more flexibility.

Yes, that is correct.

> It does make me wonder though: is it really the most sensible thing to
> overwrite any keys that already exist in the configuration? We may end
> up overwriting configuration specified by the user both in the case of
> `scalar init` and `scalar reconfigure`. But arguably, we might want to
> only ever write configuration that does _not_ yet have an explicit value
> in the configuration file, regardless of whether or not we reconfigure.

I agree that this notion of forcing config is not optimal, and is a
leftover from VFS for Git where some of these config things were
actually required for the virtualization to work. Once that idea was in
place, it was easy to think "we'll make sure the repo is configured
correctly" but that makes much less sense in Scalar these days.

>> However, there is one setting that still has this, which is
>> index.skipHash, which was previously being set to _false_ when we
>> actually prefer the value of true. Keep the overwrite here to help
>> Scalar users upgrade to the new version. We may remove that overwrite in
>> the future once we belive that most of the users who have the false
>> value have upgraded to a version that overwrites that to 'true'.
>
> Makes sense. This has likely been a bug, and we now want to rectify that
> bug.

And hopefully this is the only reason we'd need this "overwrite" feature
from this point on.

Thanks,
-Stolee

gitgitgadget · 2025-12-01T09:22:20Z

Documentation/scalar.adoc

 ~~~~~~

 delete <enlistment>::
 	This subcommand lets you delete an existing Scalar enlistment from your


On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
> index f81b2832f8..b34af225e6 100644
> --- a/Documentation/scalar.adoc
> +++ b/Documentation/scalar.adoc
> @@ -197,6 +197,164 @@ delete <enlistment>::
>  	This subcommand lets you delete an existing Scalar enlistment from your
>  	local file system, unregistering the repository.
>
> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------
> +
> +As part of both `scalar clone` and `scalar register`, certain Git config
> +values are set to optimize for large repositories or cross-platform support.
> +These options are updated in new Git versions according to the best known
> +advice for large repositories, and users can get the latest recommendations
> +by running `scalar reconfigure [--all]`.
> +
> +This section lists justifications for the config values that are set in the
> +latest version.
> +
> +am.keepCR=true::
> +	This setting is important for cross-platform development across Windows
> +	and non-Windows platforms and keeping carriage return (`\r`) characters
> +	in certain workflows.
> +
> +commitGraph.changedPaths=true::
> +	This setting helps the background maintenance steps that compute the
> +	serialized commit-graph to also store changed-path Bloom filters. This
> +	accelerates file history commands and allows users to automatically
> +	benefit without running a foreground command.

Is this something we also want to promote to "default" eventually? The
downside of course is that maintenance takes a bit longer, but given
that it runs in the background anyway this shouldn't really impact our
users all that much.

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

Is that still the case? We _did_ have some bugs in the upgrade path in
the past, but I thought it got all sorted out by now?

[snip]
> +fetch.unpackLimit=1::
> +	This setting prevents Git from unpacking packfiles into loose objects
> +	as they are downloaded from the server. This feature was intended as a
> +	way to prevent performance issues from too many packfiles, but Scalar
> +	uses background maintenance to group packfiles and cover them with a
> +	multi-pack-index, removing this issue.

The second sentence here reads as if "fetch.unpackLimit=1" was the
feature you are talking about, which led to some puzzlement at first.
But what you are talking about is the _default_ unpack limit of 100.
Maybe something like this reads better?

    This setting prevents Git from unpacking packfiles into loose objects
    as they are downloaded from the server. The default limit of 100
    objects was intended as a way to prevent performance issues from too
    many packfiles, but Scalar uses background maintenance to group
    packfiles and cover them with a multi-pack-index, removing this
    issue.

Patrick

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
>> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
>> index f81b2832f8..b34af225e6 100644
>> --- a/Documentation/scalar.adoc
>> +++ b/Documentation/scalar.adoc
>> @@ -197,6 +197,164 @@ delete <enlistment>::
>>  	This subcommand lets you delete an existing Scalar enlistment from your
>>  	local file system, unregistering the repository.
>>
>> +REQUIRED AND RECOMMENDED CONFIG
>> +-------------------------------
>> +
>> +As part of both `scalar clone` and `scalar register`, certain Git config
>> +values are set to optimize for large repositories or cross-platform support.
>> +These options are updated in new Git versions according to the best known
>> +advice for large repositories, and users can get the latest recommendations
>> +by running `scalar reconfigure [--all]`.
>> +
>> +This section lists justifications for the config values that are set in the
>> +latest version.
>> +
>> +am.keepCR=true::
>> +	This setting is important for cross-platform development across Windows
>> +	and non-Windows platforms and keeping carriage return (`\r`) characters
>> +	in certain workflows.
>> +
>> +commitGraph.changedPaths=true::
>> +	This setting helps the background maintenance steps that compute the
>> +	serialized commit-graph to also store changed-path Bloom filters. This
>> +	accelerates file history commands and allows users to automatically
>> +	benefit without running a foreground command.
>
> Is this something we also want to promote to "default" eventually? The
> downside of course is that maintenance takes a bit longer, but given
> that it runs in the background anyway this shouldn't really impact our
> users all that much.

I'm not sure, as this is a significant cost to the computation time. It
will impact foreground commands, as well. It increases the size of the
file, too. It's worth considering, but I don't think the answer is very
simple.

>> +commitGraph.generationVersion=1::
>> +	While the preferred version is 2 for performance reasons, existing users
>> +	that had version 1 by default will need special care in upgrading to
>> +	version 2. This is likely to change in the future as the upgrade story
>> +	is solidifies.
>
> Is that still the case? We _did_ have some bugs in the upgrade path in
> the past, but I thought it got all sorted out by now?

This is very likely, but I haven't validated myself. I'd be interested
to double-check and update this setting in a later series. If we update
to 2, then this would be a good reason to overwrite the old config for a
while.

> [snip]
>> +fetch.unpackLimit=1::
>> +	This setting prevents Git from unpacking packfiles into loose objects
>> +	as they are downloaded from the server. This feature was intended as a
>> +	way to prevent performance issues from too many packfiles, but Scalar
>> +	uses background maintenance to group packfiles and cover them with a
>> +	multi-pack-index, removing this issue.
>
> The second sentence here reads as if "fetch.unpackLimit=1" was the
> feature you are talking about, which led to some puzzlement at first.
> But what you are talking about is the _default_ unpack limit of 100.
> Maybe something like this reads better?
>
>     This setting prevents Git from unpacking packfiles into loose objects
>     as they are downloaded from the server. The default limit of 100
>     objects was intended as a way to prevent performance issues from too
>     many packfiles, but Scalar uses background maintenance to group
>     packfiles and cover them with a multi-pack-index, removing this
>     issue.

Good catch.

Thanks!
-Stolee

A repo may have config options set by 'scalar clone' or 'scalar
register' and then updated by 'scalar reconfigure'. It can be helpful to
point out which of those options were set by the latest scalar
recommendations.

Add "# set by scalar" to the end of each config option to assist users
in identifying why these config options were set in their repo.

Use a new helper method to simplify the two callsites.

Co-authored-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Derrick Stolee <[email protected]>

The index.skipHash config option has been set to 'false' by Scalar since
4933152 (scalar: enable path-walk during push via config, 2025-05-16)
but that commit message is trying to communicate the exact opposite:
that the 'true' value is what we want instead. This means that we've
been disabling this performance benefit for Scalar repos
unintentionally.

Fix this issue before we add justification for the config options set in
this list.

Oddly, enabling index.skipHash causes a test issue during 'test_commit'
in one of the Scalar tests when GIT_TEST_SPLIT_INDEX is enabled (as
caught by the linux-test-vars build). I'm fixing the test by disabling
the environment variable, but the issue should be resolved in a series
focused on the split index.

Signed-off-by: Derrick Stolee <[email protected]>

These config values were added in the original Scalar contribution,
d0feac4 (scalar: 'register' sets recommended config and starts
maintenance, 2021-12-03), but were never fully checked for validity in
the upstream Git project. At the time, Scalar was only intended for the
contrib/ directory so did not have as rigorous of an investigation.

Each config option has its own justification for removal:

* core.preloadIndex: This value is true by default, now. Removing this
  causes some changes required to the tests that checked this config
  value. Use gui.gcwarning=false instead.

* core.fscache: This config does not exist in the core Git project, but
  is instead a config option for a Git for Windows feature.

* core.multiPackIndex: This config value is now enabled by default, so
  does not need to be called out specifically. It was originally
  included to make sure the background maintenance that created
  multi-pack-indexes would result in the expected performance
  improvements.

* credential.validate: This option is not something specific to Git but
  instead an older version of Git Credential Manager for Windows. That
  software was replaced several years ago by the cross-platform Git
  Credential Manager so this option is no longer needed to help users who
  were on that older software.

* pack.useSparse=true: This value is now Git's default as of de3a864
  (config: set pack.useSparse=true by default, 2020-03-20) so we don't
  need it set by Scalar.

Signed-off-by: Derrick Stolee <[email protected]>

The config values set by Scalar went through an audit in the previous
changes, so now reorganize the settings and simplify their purpose.

First, alphabetize the config options, except put the platform-specific
options at the end. This groups two Windows-specific settings and only
one non-Windows setting.

Also, this removes the 'overwrite_on_reconfigure' setting for many of
these options. That setting made nearly all of these options "required"
for scalar enlistments, restricting use for users. Instead, now nearly
all options have removed this setting.

However, there is one setting that still has this, which is
index.skipHash, which was previously being set to _false_ when we
actually prefer the value of true. Keep the overwrite here to help
Scalar users upgrade to the new version. We may remove that overwrite in
the future once we believe that most of the users who have the false
value have upgraded to a version that overwrites that to 'true'.

Signed-off-by: Derrick Stolee <[email protected]>

Add user-facing documentation that justifies the values being set by
'scalar clone', 'scalar register', and 'scalar reconfigure'.

Helped-by: Junio C Hamano <[email protected]>
Helped-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Derrick Stolee <[email protected]>

gitgitgadget · 2025-12-01T14:26:55Z

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Stolee,

On Wed, 26 Nov 2025, Derrick Stolee via GitGitGadget wrote:

> In September [1], we discussed that the Scalar config options could use some
> documented justification as well as some comments to the config file that
> they were set by Scalar. I was then immediately distracted by other work
> things and am finally here with a series to do just that.

Thank you for doing this, in particular the (quite long!) list of
explanations is excellent, especially when some user wonders why a
particular setting was chosen and wants to understand the reason.

> 
> [1]
> https://lore.kernel.org/git/[email protected]/
> 
> I have indeed used Patrick's idea to add '# set by scalar' to each line
> added by Scalar, it took a little more work for all the kinds of config set.

I am glad that the work I put in to optionally add comments pays off.

It's a bit sad that there is no well-designed bulk-edit "API" function,
which means that the `file` variable has to be constructed and `free()`d
many times, but that's not the fault of this series.

> I made myself a co-author.
> 
> While working to justify each config option, I found some stale or incorrect
> config options. I also relaxed the override setting in most cases which gave
> me an opportunity to alphabetize the settings.
> 
> There was at least one case (I'm thinking of core.fscache here) where the
> config doesn't even exist in core Git, but instead in Git for Windows. We'll
> need to adjust in that fork to reinclude it in the right place.

Thank you for calling this out! I will take care of this in Git for
Windows and also in Microsoft Git (which inherits this flag from Git for
Windows).

To be honest, I am not so certain that we want the FSCache to be enabled,
it does have long-standing bugs (introduced by the partial clone feature,
for example, where the FSCache continues to retain stale information about
which loose objects are present even after the missing ones have been
fetched). I guess we'll have to measure the actual performance benefits to
reassess whether the feature is worth the trouble.

Thank you for your diligent work, as always,
Johannes

> 
> Thanks, -Stolee
> 
> Derrick Stolee (5):
>   scalar: annotate config file with "set by scalar"
>   scalar: use index.skipHash=true for performance
>   scalar: remove stale config values
>   scalar: alphabetize and simplify config
>   scalar: document config settings
> 
>  Documentation/scalar.adoc | 158 ++++++++++++++++++++++++++++++++++++++
>  scalar.c                  |  81 ++++++++++---------
>  t/t9210-scalar.sh         |  26 ++++---
>  3 files changed, 218 insertions(+), 47 deletions(-)
> 
> 
> base-commit: 6ab38b7e9cc7adafc304f3204616a4debd49c6e9
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2010%2Fderrickstolee%2Fscalar-config-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2010/derrickstolee/scalar-config-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/2010
> -- 
> gitgitgadget
>

derrickstolee · 2025-12-01T16:49:42Z

/submit

gitgitgadget · 2025-12-01T16:51:30Z

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2010/derrickstolee/scalar-config-v2

To fetch this version to local tag pr-2010/derrickstolee/scalar-config-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2010/derrickstolee/scalar-config-v2

gitgitgadget · 2025-12-01T17:50:21Z

scalar.c


 static int set_recommended_config(int reconfigure)
 {
 	struct scalar_config config[] = {


On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.

I was going to ask about if we could also rely on the default value of
index.threads like we do here, but then went and did some reading and realised
some config values, like index.recordOffsetTable, have their value set
according to whether index.threads was explicitly set, so I guess there's an
implicit reliance on that behaviour that we want to keep?

> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
>
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manger so this option is no longer needed to help users who
>   were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for the detail on all of these, very helpful

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 05:46:46PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> > * core.preloadIndex: This value is true by default, now. Removing this
> >   causes some changes required to the tests that checked this config
> >   value. Use gui.gcwarning=false instead.
>
> I was going to ask about if we could also rely on the default value of
> index.threads like we do here, but then went and did some reading and realised
> some config values, like index.recordOffsetTable, have their value set
> according to whether index.threads was explicitly set, so I guess there's an
> implicit reliance on that behaviour that we want to keep?

Wait. Are you saying that "index.recordOffsetTable" behaves differently
based on whether "index.threads" is implicitly enabled due to the
default value or explicitly enabled via the configuration? If so, that
smells like a plain bug to me.

Patrick

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> Wait. Are you saying that "index.recordOffsetTable" behaves differently
> based on whether "index.threads" is implicitly enabled due to the
> default value or explicitly enabled via the configuration?

That was my understanding from a cursory read of the results of searching for
'index.threads' in git-config:

> index.recordEndOfIndexEntries
> ...
> Defaults to true if index.threads has been explicitly enabled, false
> otherwise

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Dec 02, 2025 at 07:04:24PM +0000, Matthew Hughes wrote:
> On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> > Wait. Are you saying that "index.recordOffsetTable" behaves differently
> > based on whether "index.threads" is implicitly enabled due to the
> > default value or explicitly enabled via the configuration?
>
> That was my understanding from a cursory read of the results of searching for
> 'index.threads' in git-config:
>
> > index.recordEndOfIndexEntries
> > ...
> > Defaults to true if index.threads has been explicitly enabled, false
> > otherwise

Hm, true. At least that's a conscious decision then. The logic around
this was introduced in 2a9dedef2e (index: make index.threads=true enable
ieot and eoie, 2018-11-19), and the ultimate reason for it seems to be
backwards compatibility:

    index.threads and index.recordOffsetTable unspecified: do not write
    the offset table yet (to avoid alarming the user with "ignoring IEOT
    extension" messages when an older version of Git accesses the
    repository) but do make use of multiple threads to read the index if
    the supporting offset table is present.

Older versions of Git complained when they saw unknown extensions, and
we didn't want to expose users to such warnings.

That makes me wonder whether it's time now to revisit that decision --
it's been 7 years since then, I guess that many clients nowadays would
understand the extension. The only (documented) downside should thus not
be that important anymore, but the upside is that reading the index
would be faster if we default-enable writing the extension.

Patrick
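
The defaulting rule discussed in this sub-thread can be modeled as a small decision function. This is only a sketch of the documented behavior, not Git's code: `writes_offset_table` is a hypothetical name, its two arguments stand in for the config values (an empty string meaning "unset"), booleans are simplified to the literal strings `true` and `false`, and it assumes "explicitly enabled" means any explicit index.threads value other than `false` or `1`.

```shell
#!/bin/sh
# Illustrative model only, not Git's implementation.
# $1: value of index.recordOffsetTable, or "" if unset
# $2: value of index.threads, or "" if unset
# Returns 0 if the offset table (IEOT) would be written, 1 otherwise.
writes_offset_table () {
	# An explicit index.recordOffsetTable always wins.
	case "$1" in
	true)  return 0 ;;
	false) return 1 ;;
	esac
	# Otherwise the default depends on whether index.threads was
	# explicitly enabled, not on its implicit default value.
	case "$2" in
	""|false|1) return 1 ;;  # unset, or explicitly single-threaded
	*)          return 0 ;;  # true, 0 (auto-detect), or a count above 1
	esac
}

# Both unset: threads may be used for reading, but no table is written.
if writes_offset_table "" ""; then echo write; else echo skip; fi
# index.threads=true set explicitly: the table is written.
if writes_offset_table "" true; then echo write; else echo skip; fi
```

Under this model the same effective thread setting gives two different results depending on whether it was configured or defaulted, which is the asymmetry Patrick first suspected to be a bug.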

gitgitgadget · 2025-12-01T17:50:23Z

User Matthew Hughes <[email protected]> has been added to the cc: list.

gitgitgadget · 2025-12-01T18:00:46Z

Documentation/scalar.adoc

 ~~~~~~

 delete <enlistment>::
 	This subcommand lets you delete an existing Scalar enlistment from your


On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> Add user-facing documentation that justifies the values being set by
> 'scalar clone', 'scalar register', and 'scalar reconfigure'.

Thanks! This is exactly what I was hoping for.

> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------

Would it be worth noting in scalar.c that the config options listed there are
documented here, so that a dev changing the list in the source will know to
also update this? I assume there's an understanding that if e.g. you update a
flag you should know to also update relevant docs, but perhaps this is a bit
more niche.

> +gc.auto=0::
> +	This disables automatic garbage collection, since Scalar uses background
> +	maintenance to keep the repository data in good shape.

Checking my understanding: this means there will be _no_ automatic GC in a
scalar repo? Since scalar calls 'maintenance register' which means
maintenance.strategy will be set to 'incremental' which won't schedule any gc
runs

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 05:58:06PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> > Add user-facing documentation that justifies the values being set by
> > 'scalar clone', 'scalar register', and 'scalar reconfigure'.
>
> Thanks! This is exactly what I was hoping for.
>
> > +REQUIRED AND RECOMMENDED CONFIG
> > +-------------------------------
>
> Would it be worth noting in scalar.c that the config options listed there are
> documented here, so that a dev changing the list in the source will know to
> also update this? I assume there's an understanding that if e.g. you update a
> flag you should know to also update relevant docs, but perhaps this is a bit
> more niche.
>
> > +gc.auto=0::
> > +	This disables automatic garbage collection, since Scalar uses background
> > +	maintenance to keep the repository data in good shape.
>
> Checking my understanding: this means there will be _no_ automatic GC in a
> scalar repo? Since scalar calls 'maintenance register' which means
> maintenance.strategy will be set to 'incremental' which won't schedule any gc
> runs

Yes, auto-garbage-collection is completely disabled in repositories
managed by Scalar. And I guess that made sense in the past:
auto-maintenance did not know about maintenance strategies at all, and
consequently it would still run git-gc(1). And that's not really
compatible with the "incremental" strategy that Scalar wants to use.

I changed that in Git 2.52 so that maintenance strategies now apply to
both scheduled and normal maintenance. But I was worried about backwards
compatibility for the "incremental" strategy, so I made the change in a
backwards compatible way so that normal maintenance still ends up using
git-gc(1).

Arguably though, we can now iterate on our infrastructure: if we were to
introduce an "incremental-v2" strategy we could adapt it to have proper
strategies for both scheduled and normal maintenance. And if so, we can
adapt Scalar in such a way that it doesn't have to disable auto
maintenance anymore.

I think that would be a reasonable thing to do. Scheduled maintenance
only runs once per hour, and in a high-activity repo a user may easily
generate tons of objects in that hour that make the repository perform
badly.

Patrick

gitgitgadget · 2025-12-02T03:34:35Z

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

>      -@@ scalar.c: static int set_scalar_config(const struct scalar_config *config, int reconfigure
>      +@@ scalar.c: struct scalar_config {
>      + 	int overwrite_on_reconfigure;
>      + };
>      + 
>      ++static int set_config_with_comment(const char *key, const char *value)

I do not care too deeply as this is a file-scope static that is
called only twice, but I would have preferred scalar_set_config()
which is a lot more specific to the purpose of this function (and the
comment "# set by scalar" is hardcoded constant in this function
that its callers cannot affect, so "with_comment" is not even a
statement that "the callers can add comment to their config
settings") which would have taken a bit shorter line to call.

>       +fetch.unpackLimit=1::
>       +	This setting prevents Git from unpacking packfiles into loose objects
>      -+	as they are downloaded from the server. This feature was intended as a
>      -+	way to prevent performance issues from too many packfiles, but Scalar
>      -+	uses background maintenance to group packfiles and cover them with a
>      -+	multi-pack-index, removing this issue.
>      ++	as they are downloaded from the server. The default limit of 100 was
>      ++	intended as a way to prevent performance issues from too many packfiles,
>      ++	but Scalar uses background maintenance to group packfiles and cover them
>      ++	with a multi-pack-index, removing this issue.

Nicely explained.

Will replace (when I land).

Thanks.
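
The effect of fetch.unpackLimit=1 described in the quoted documentation can be illustrated with a toy model of the threshold. This is a sketch, not Git's implementation: `fetch_storage` is a hypothetical helper, and it assumes the documented rule that a fetch receiving fewer objects than the limit is unpacked into loose objects, while anything at or above the limit is kept as a packfile.

```shell
#!/bin/sh
# Illustrative model only, not Git's implementation.
# $1: number of objects received in a fetch
# $2: value of fetch.unpackLimit
# Prints "loose" if the objects would be unpacked, "pack" if the
# received packfile would be kept as-is.
fetch_storage () {
	if [ "$1" -lt "$2" ]
	then
		echo loose
	else
		echo pack
	fi
}

fetch_storage 50 100   # small fetch under the default limit of 100
fetch_storage 50 1     # the same fetch with Scalar's setting
```

With a limit of 1, no non-empty fetch falls below the threshold, so every fetch is stored as a packfile and left for background maintenance to consolidate.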

gitgitgadget · 2025-12-02T07:57:52Z

scalar.c

 #include "refs.h"
 #include "dir.h"
 #include "packfile.h"
 #include "help.h"


On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:43PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

We _could_ make this a bit more solid by adding a test that:

  1. Initializes a new repository.

  2. Saves the configuration.

  3. Performs `scalar reconfigure`.

  4. Asserts that all new non-section-header lines in the configuration
     have a trailing "# set by scalar" comment.

This would ensure that there is no callsite we forgot to add the new
annotation to, and that there are no new future callsites where somebody
isn't aware of the comments.

I don't insist on such a test though, so please feel free to ignore this
suggestion.

Patrick
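
Step 4 of the suggested test could be sketched as a plain-text comparison of the two config files. This is a minimal sketch, not part of the series: `unannotated_new_lines` is a hypothetical helper, and it assumes that a "new" line is any line of the later config that does not appear verbatim in the saved one.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git's test suite.
# $1: config file saved before running scalar
# $2: config file after running scalar
# Prints every line that is new in $2, is neither a section header nor
# blank, and lacks the trailing "# set by scalar" comment.
unannotated_new_lines () {
	# -v inverts the match, -x matches whole lines, -F treats each
	# pattern as a fixed string, -f reads the patterns from a file.
	grep -vxFf "$1" "$2" |
	grep -v '^[[:space:]]*\[' |
	grep -v '^[[:space:]]*$' |
	grep -v '# set by scalar$'
}

# Rough usage inside a test body, assuming the steps listed above:
#   git init repo &&
#   cp repo/.git/config before &&
#   scalar reconfigure repo &&
#   unannotated_new_lines before repo/.git/config >actual
# followed by an assertion that "actual" is empty.
```

Because the last `grep` exits non-zero when nothing is left to print, the function's exit status is not a pass/fail signal; checking that its output is empty is the more natural assertion.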

derrickstolee self-assigned this Nov 26, 2025

derrickstolee force-pushed the scalar-config branch 2 times, most recently from f38ead9 to 18580f0 Compare November 26, 2025 19:58

gitgitgadget bot reviewed Nov 26, 2025

gitgitgadget bot reviewed Nov 27, 2025

gitgitgadget bot added the seen label Nov 29, 2025

gitgitgadget bot reviewed Dec 1, 2025

derrickstolee and others added 4 commits December 1, 2025 07:30

derrickstolee force-pushed the scalar-config branch from 18580f0 to 70bdcf7 Compare December 1, 2025 12:41

derrickstolee force-pushed the scalar-config branch from 70bdcf7 to ac1627d Compare December 1, 2025 14:10

gitgitgadget bot reviewed Dec 1, 2025

gitgitgadget bot reviewed Dec 2, 2025
