Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
158 changes: 158 additions & 0 deletions Documentation/scalar.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -197,6 +197,164 @@ delete <enlistment>::
This subcommand lets you delete an existing Scalar enlistment from your
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

"as the upgrade story solidifies"?

> +fetch.writeCommitGraph=false::
> +	This config setting was created to help users automatically udpate their
> +	commit-graph files as they perform fetches. However, this takes time
> +	from foreground fetches and pulls and Scalar uses background maintenance
> +	for this function instead.

"update their files".

> +index.threads=true::
> +	This tells Git to automatically detect how many threads it should use
> +	when reading the index in parallel due to the `core.preloadIndex=true`
> +	setting.

Is "due to the `core.preloadIndex=true` setting" part of this
sentence still relevant?


Other than that, superbly written.  Thanks, will queue.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 11/26/2025 7:09 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> 
>> +commitGraph.generationVersion=1::
>> +	While the preferred version is 2 for performance reasons, existing users
>> +	that had version 1 by default will need special care in upgrading to
>> +	version 2. This is likely to change in the future as the upgrade story
>> +	is solidifies.
> 
> "as the upgrade story solidifies"?

That's better than what I was going for which was "is solidified". Will fix.

>> +fetch.writeCommitGraph=false::
>> +	This config setting was created to help users automatically udpate their
>> +	commit-graph files as they perform fetches. However, this takes time
>> +	from foreground fetches and pulls and Scalar uses background maintenance
>> +	for this function instead.
> 
> "update their files".

Yes. thanks.

>> +index.threads=true::
>> +	This tells Git to automatically detect how many threads it should use
>> +	when reading the index in parallel due to the `core.preloadIndex=true`
>> +	setting.
> 
> Is "due to the `core.preloadIndex=true` setting" part of this
> sentence still relevant?

I should still include this, but mention that it is enabled by default and
still recommended.

> Other than that, superbly written.  Thanks, will queue.

Thanks,
-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
> index f81b2832f8..b34af225e6 100644
> --- a/Documentation/scalar.adoc
> +++ b/Documentation/scalar.adoc
> @@ -197,6 +197,164 @@ delete <enlistment>::
>  	This subcommand lets you delete an existing Scalar enlistment from your
>  	local file system, unregistering the repository.
>  
> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------
> +
> +As part of both `scalar clone` and `scalar register`, certain Git config
> +values are set to optimize for large repositories or cross-platform support.
> +These options are updated in new Git versions according to the best known
> +advice for large repositories, and users can get the latest recommendations
> +by running `scalar reconfigure [--all]`.
> +
> +This section lists justifications for the config values that are set in the
> +latest version.
> +
> +am.keepCR=true::
> +	This setting is important for cross-platform development across Windows
> +	and non-Windows platforms and keeping carriage return (`\r`) characters
> +	in certain workflows.
> +
> +commitGraph.changedPaths=true::
> +	This setting helps the background maintenance steps that compute the
> +	serialized commit-graph to also store changed-path Bloom filters. This
> +	accelerates file history commands and allows users to automatically
> +	benefit without running a foreground command.

Is this something we also want to promote to "default" eventually? The
downside of course is that maintenance takes a bit longer, but given
that it runs in the background anyway this shouldn't really impact our
users all that much.

> +commitGraph.generationVersion=1::
> +	While the preferred version is 2 for performance reasons, existing users
> +	that had version 1 by default will need special care in upgrading to
> +	version 2. This is likely to change in the future as the upgrade story
> +	is solidifies.

Is that still the case? We _did_ have some bugs in the upgrade path in
the past, but I thought it got all sorted out by now?

[snip]
> +fetch.unpackLimit=1::
> +	This setting prevents Git from unpacking packfiles into loose objects
> +	as they are downloaded from the server. This feature was intended as a
> +	way to prevent performance issues from too many packfiles, but Scalar
> +	uses background maintenance to group packfiles and cover them with a
> +	multi-pack-index, removing this issue.

The second sentence here reads as if "fetch.unpackLimit=1" was the
feature you are talking about, which led to some puzzlement at first.
But what you are talking about is the _default_ unpack limit of 100.
Maybe something like this reads better?

    This setting prevents Git from unpacking packfiles into loose objects
    as they are downloaded from the server. The default limit of 100
    objects was intended as a way to prevent performance issues from too
    many packfiles, but Scalar uses background maintenance to group
    packfiles and cover them with a multi-pack-index, removing this
    issue.

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
>> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
>> index f81b2832f8..b34af225e6 100644
>> --- a/Documentation/scalar.adoc
>> +++ b/Documentation/scalar.adoc
>> @@ -197,6 +197,164 @@ delete <enlistment>::
>>   	This subcommand lets you delete an existing Scalar enlistment from your
>>   	local file system, unregistering the repository.
>>   >> +REQUIRED AND RECOMMENDED CONFIG
>> +-------------------------------
>> +
>> +As part of both `scalar clone` and `scalar register`, certain Git config
>> +values are set to optimize for large repositories or cross-platform support.
>> +These options are updated in new Git versions according to the best known
>> +advice for large repositories, and users can get the latest recommendations
>> +by running `scalar reconfigure [--all]`.
>> +
>> +This section lists justifications for the config values that are set in the
>> +latest version.
>> +
>> +am.keepCR=true::
>> +	This setting is important for cross-platform development across Windows
>> +	and non-Windows platforms and keeping carriage return (`\r`) characters
>> +	in certain workflows.
>> +
>> +commitGraph.changedPaths=true::
>> +	This setting helps the background maintenance steps that compute the
>> +	serialized commit-graph to also store changed-path Bloom filters. This
>> +	accelerates file history commands and allows users to automatically
>> +	benefit without running a foreground command.
> > Is this something we also want to promote to "default" eventually? The
> downside of course is that maintenance takes a bit longer, but given
> that it runs in the background anyway this shouldn't really impact our
> users all that much.

I'm not sure, as this is a significant cost to the computation time. It will
impact foreground commands, as well. It increases the size of the file, too.

It's worth considering, but I don't think the answer is very simple.

>> +commitGraph.generationVersion=1::
>> +	While the preferred version is 2 for performance reasons, existing users
>> +	that had version 1 by default will need special care in upgrading to
>> +	version 2. This is likely to change in the future as the upgrade story
>> +	is solidifies.
> > Is that still the case? We _did_ have some bugs in the upgrade path in
> the past, but I thought it got all sorted out by now?

This is very likely, but I haven't validated myself. I'd be interested to
double-check and update this setting in a later series. If we update to 2,
then this would be a good reason to overwrite the old config for a while.

> [snip]
>> +fetch.unpackLimit=1::
>> +	This setting prevents Git from unpacking packfiles into loose objects
>> +	as they are downloaded from the server. This feature was intended as a
>> +	way to prevent performance issues from too many packfiles, but Scalar
>> +	uses background maintenance to group packfiles and cover them with a
>> +	multi-pack-index, removing this issue.
> > The second sentence here reads as if "fetch.unpackLimit=1" was the
> feature you are talking about, which led to some puzzlement at first.
> But what you are talking about is the _default_ unpack limit of 100.
> Maybe something like this reads better?
> >      This setting prevents Git from unpacking packfiles into loose objects
>      as they are downloaded from the server. The default limit of 100
>      objects was intended as a way to prevent performance issues from too
>      many packfiles, but Scalar uses background maintenance to group
>      packfiles and cover them with a multi-pack-index, removing this
>      issue.

Good catch. Thanks!

-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> Add user-facing documentation that justifies the values being set by
> 'scalar clone', 'scalar register', and 'scalar reconfigure'.

Thanks! This is exactly what I was hoping for.

> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------

Would it be worth noting in scalar.c that the config options listed there are
documented here, So that a dev changing the list in the source will know to
also update this? I assume there's an understanding that if e.g. you update a
flag you should know to also update relevant docs, but perhaps this is a bit
more niche.

> +gc.auto=0::
> +	This disables automatic garbage collection, since Scalar uses background
> +	maintenance to keep the repository data in good shape.

Checking my understanding: this means there will be _no_ automatic GC in a
scalar repo? Since scalar calls 'maintenance register' which means
maintenance.strategy will be set to 'incremental' which won't schedule any gc
runs

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 05:58:06PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> > Add user-facing documentation that justifies the values being set by
> > 'scalar clone', 'scalar register', and 'scalar reconfigure'.
> 
> Thanks! This is exactly what I was hoping for.
> 
> > +REQUIRED AND RECOMMENDED CONFIG
> > +-------------------------------
> 
> Would it be worth noting in scalar.c that the config options listed there are
> documented here, So that a dev changing the list in the source will know to
> also update this? I assume there's an understanding that if e.g. you update a
> flag you should know to also update relevant docs, but perhaps this is a bit
> more niche.
> 
> > +gc.auto=0::
> > +	This disables automatic garbage collection, since Scalar uses background
> > +	maintenance to keep the repository data in good shape.
> 
> Checking my understanding: this means there will be _no_ automatic GC in a
> scalar repo? Since scalar calls 'maintenance register' which means
> maintenance.strategy will be set to 'incremental' which won't schedule any gc
> runs

Yes, auto-garbage-collection is completely disabled in repositories
managed by Scalar. And I guess that made sense in the past:
auto-maintenance did not know about maintenance strategies at all, and
consequently it would still run git-gc(1). And that's not really
compatible with the "incremental" strategy that Scalar wants to use.

I changed that in Git 2.52 so that maintenance strategies now apply to
both scheduled and normal maintenance. But I was worried about backwards
compatibility for the "incremental" strategy, so I made the change in a
backwards compatible way so that normal maintenance still ends up using
git-gc(1).

Arguably though, we can now iterate on our infrastructure: if we were to
introduce an "incremental-v2" strategy we could adapt it to have proper
strategies for both scheduled and normal maintenance. And if so, we can
adapt Scalar in such a way that it doesn't have to disable auto
maintenance anymore.

I think that would be a reasonable thing to do. Scheduled maintenance
only runs once per hour, and in a high-activity repo a user may easily
generate tons of objects in that hour that make the repository perform
badly.

Patrick

local file system, unregistering the repository.

REQUIRED AND RECOMMENDED CONFIG
-------------------------------

As part of both `scalar clone` and `scalar register`, certain Git config
values are set to optimize for large repositories or cross-platform support.
These options are updated in new Git versions according to the best known
advice for large repositories, and users can get the latest recommendations
by running `scalar reconfigure [--all]`.

This section lists justifications for the config values that are set in the
latest version.

am.keepCR=true::
This setting is important for cross-platform development across Windows
and non-Windows platforms and keeping carriage return (`\r`) characters
in certain workflows.

commitGraph.changedPaths=true::
This setting helps the background maintenance steps that compute the
serialized commit-graph to also store changed-path Bloom filters. This
accelerates file history commands and allows users to automatically
benefit without running a foreground command.

commitGraph.generationVersion=1::
While the preferred version is 2 for performance reasons, existing users
that had version 1 by default will need special care in upgrading to
version 2. This is likely to change in the future as the upgrade story
solidifies.

core.autoCRLF=false::
This removes the transformation of worktree files to add CRLF line
endings when only LF line endings exist. This is removed for performance
reasons. Repositories that use tools that care about CRLF line endings
should commit the necessary files with those line endings instead.

core.logAllRefUpdates=true::
This enables the reflog on all branches. While this is a performance
cost for large repositories, it is frequently an important data source
for users to get out of bad situations or to seek support from experts.

core.safeCRLF=false::
Similar to `core.autoCRLF=false`, this disables checks around whether
the CRLF conversion is reversible. This is a performance improvement,
but can be dangerous if `core.autoCRLF` is reenabled by the user.

credential.https://dev.azure.com.useHttpPath=true::
This setting enables the `credential.useHttpPath` feature only for web
URLs for Azure DevOps. This is important for users interacting with that
service using multiple organizations and thus multiple credential
tokens.

feature.experimental=false::
This disables the "experimental" optimizations grouped under this
feature config. The expectation is that all valuable optimizations are
also set explicitly by Scalar config, and any differences are
intentional. Notable differences include several bitmap-related config
options which are disabled for client-focused Scalar repos.

feature.manyFiles=false::
This disables the "many files" optimizations grouped under this feature
config. The expectation is that all valuable optimizations are also set
explicitly by Scalar config, and any differences are intentional.

fetch.showForcedUpdates=false::
This disables the check at the end of `git fetch` that notifies the user
if the ref update was a forced update (one where the previous position
is not reachable from the latest position). This check can be very
expensive in large repositories, so is disabled and replaced with an
advice message. Set `advice.fetchShowForcedUpdates=false` to disable
this advice message.

fetch.unpackLimit=1::
This setting prevents Git from unpacking packfiles into loose objects
as they are downloaded from the server. The default limit of 100 was
intended as a way to prevent performance issues from too many packfiles,
but Scalar uses background maintenance to group packfiles and cover them
with a multi-pack-index, removing this issue.

fetch.writeCommitGraph=false::
This config setting was created to help users automatically update their
commit-graph files as they perform fetches. However, this takes time
from foreground fetches and pulls and Scalar uses background maintenance
for this function instead.

gc.auto=0::
This disables automatic garbage collection, since Scalar uses background
maintenance to keep the repository data in good shape.

gui.GCWarning=false::
Since Scalar disables garbage collection by setting `gc.auto=0`, the
`git-gui` tool may start to warn about this setting. Disable this
warning as Scalar's background maintenance configuration makes the
warning irrelevant.

index.skipHash=true::
Disable computing the hash of the index contents as it is being written.
This assists with performance, especially for large index files.

index.threads=true::
This tells Git to automatically detect how many threads it should use
when reading the index due to the default value of `core.preloadIndex`,
which enables parallel index reads.

index.version=4::
This index version adds compression to the path names, reducing the size
of the index in a significant way for large repos. This is an important
performance boost.

merge.renames=true::
When computing merges in large repos, it is particularly important to
detect renames to maximize the potential for a result that will validate
correctly. Users performing merges locally are more likely to be doing
so because a server-side merge (via pull request or similar) resulted in
conflicts. While this is the default setting, it is set specifically to
override a potential change to `diff.renames` which a user may set for
performance reasons.

merge.stat=false::
This disables a diff output after computing a merge. This improves
performance of `git merge` for large repos while reducing noisy output.

pack.useBitmaps=false::
This disables the use of `.bitmap` files attached to packfiles. Bitmap
files are optimized for server-side use, not client-side use. Scalar
disables this to avoid some performance issues that can occur if a user
accidentally creates `.bitmap` files.

pack.usePathWalk=true::
This enables the `--path-walk` option to `git pack-objects` by default.
This can accelerate the computation and compression of packfiles created
by `git push` and other repack operations.

receive.autoGC=false::
Similar to `gc.auto`, this setting is disabled in preference of
background maintenance.

status.aheadBehind=false::
This disables the ahead/behind calculation that would normally happen
during a `git status` command. This information is frequently ignored by
users but can be expensive to calculate in large repos that receive
thousands of commits per day. The calculation is replaced with an advice
message that can be disabled by disabling the `advice.statusAheadBehind`
config.

The following settings are different based on which platform is in use:

core.untrackedCache=(true|false)::
The untracked cache feature is important for performance benefits on
large repositories, but has demonstrated some bugs on Windows
filesystems. Thus, this is set for other platforms but disabled on
Windows.

http.sslBackend=schannel::
On Windows, the `openssl` backend has some issues with certain types of
remote providers and certificate types. Override the default setting to
avoid these common problems.


SEE ALSO
--------
linkgit:git-clone[1], linkgit:git-maintenance[1].
Expand Down
83 changes: 44 additions & 39 deletions scalar.c
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
#include "help.h"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> Add "# set by scalar" to the end of each config option to assist users
> in identifying why these config options were set in their repo.

The implementation is quite straight-forward, inlining expansion of
repo_config_set_gently() in the places that we want to add comment to.

If we had (a lot) more than two callsites, I would have suggested to
add a simple helper function, something like

    static int scalar_config_set(struct repository *r, const char *key, const char *value)
    {
	char *file = repo_git_path(r, "config");
        int res = repo_config_set_multivar_in_file_gently(r, file,
		key, value, NULL, " # set by scalar", 0);
	free(file);
	return res;
    }

and then the updates to the callers would have been absolute minimum.

Well, even with only two callsites, perhaps such a refactoring may
still have value in reducing the risk of typo in the comment.

> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

Looks good.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 03:55:10PM -0800, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> 
> > Add "# set by scalar" to the end of each config option to assist users
> > in identifying why these config options were set in their repo.
> 
> The implementation is quite straight-forward, inlining expansion of
> repo_config_set_gently() in the places that we want to add comment to.
> 
> If we had (a lot) more than two callsites, I would have suggested to
> add a simple helper function, something like
> 
>     static int scalar_config_set(struct repository *r, const char *key, const char *value)
>     {
> 	char *file = repo_git_path(r, "config");
>         int res = repo_config_set_multivar_in_file_gently(r, file,
> 		key, value, NULL, " # set by scalar", 0);
> 	free(file);
> 	return res;
>     }
> 
> and then the updates to the callers would have been absolute minimum.
> 
> Well, even with only two callsites, perhaps such a refactoring may
> still have value in reducing the risk of typo in the comment.

Agreed, I think it's a good idea to provide such a function. The calls
to `repo_config_set_multivar_in_file_gently()` are quite verbose.

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:43PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
>  	GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
>  	test_path_is_file one/src/cron.txt &&
>  	test true = "$(git -C one/src config core.preloadIndex)" &&
> +	test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> +	test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
>  	test_subcommand git maintenance start <reconfigure &&
>  	test_subcommand ! git maintenance unregister --force <reconfigure &&

We _could_ make this a bit more solid by adding a test that:

  1. Initializes a new repository.

  2. Saves the configuration.

  3. Performs `scalar reconfigure`.

  4. Asserts that all new non-section-header lines in the configuration
     have a trailing "#set by scalar" comment.

This would ensure that there is no callsite we forgot to add the new
annotation to, and that there are new future callsites where somebody
isn't aware of the comments.

I don't insist on such a test though, so please feel free to ignore this
suggestion.

Patrick

#include "setup.h"
#include "trace2.h"
#include "path.h"

static void setup_enlistment_directory(int argc, const char **argv,
const char * const *usagestr,
Expand Down Expand Up @@ -95,6 +96,16 @@ struct scalar_config {
int overwrite_on_reconfigure;
};

static int set_config_with_comment(const char *key, const char *value)
{
char *file = repo_git_path(the_repository, "config");
int res = repo_config_set_multivar_in_file_gently(the_repository, file,
key, value, NULL,
" # set by scalar", 0);
free(file);
return res;
}

static int set_scalar_config(const struct scalar_config *config, int reconfigure)
{
char *value = NULL;
Expand All @@ -103,7 +114,7 @@ static int set_scalar_config(const struct scalar_config *config, int reconfigure
if ((reconfigure && config->overwrite_on_reconfigure) ||
repo_config_get_string(the_repository, config->key, &value)) {
trace2_data_string("scalar", the_repository, config->key, "created");
res = repo_config_set_gently(the_repository, config->key, config->value);
res = set_config_with_comment(config->key, config->value);
} else {
trace2_data_string("scalar", the_repository, config->key, "exists");
res = 0;
Expand All @@ -122,13 +133,33 @@ static int have_fsmonitor_support(void)
static int set_recommended_config(int reconfigure)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <[email protected]>
> 
> The config values set by Scalar went through an audit in the previous
> changes, so now reorganize the settings and simplify their purpose.
> 
> First, alphabetize the config options, except put the platform-specific
> options at the end. This groups two Windows-specific settings and only
> one non-Windows setting.
> 
> Also, this removes the 'overwrite_on_reconfigure' setting for many of
> these options. That setting made nearly all of these options "required"
> for scalar enlistments, restricting use for users. Instead, now nearly
> all options have removed this setting.

As far as I understand, this setting causes us to overwrite any
preexisting config values when reconfiguring Scalar? So with your
changes the effect is that we now don't do that anymore, which allows
the user to tune some of the configuration values to their liking after
having run `scalar init` for the first time. I guess that makes sense,
as it gives the user more flexibility.

It does make me wonder though: is it really the most sensible thing to
overwrite any keys that already exist in the configuration? We may end
up overwriting configuration specified by the user both in the case of
`scalar init` and `scalar reconfigure`. But arguably, we might want to
only ever write configuration that does _not_ yet have an explicit value
in the configuration file, regardless of whether or not we reconfigure.

> However, there is one setting that still has this, which is
> index.skipHash, which was previously being set to _false_ when we
> actually prefer the value of true. Keep the overwrite here to help
> Scalar users upgrade to the new version. We may remove that overwrite in
> the future once we belive that most of the users who have the false
> value have upgraded to a version that overwrites that to 'true'.

Makes sense. This has likely been a bug, and we now want to rectify that
bug.

Thanks!

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <[email protected]>
>>
>> The config values set by Scalar went through an audit in the previous
>> changes, so now reorganize the settings and simplify their purpose.
>>
>> First, alphabetize the config options, except put the platform-specific
>> options at the end. This groups two Windows-specific settings and only
>> one non-Windows setting.
>>
>> Also, this removes the 'overwrite_on_reconfigure' setting for many of
>> these options. That setting made nearly all of these options "required"
>> for scalar enlistments, restricting use for users. Instead, now nearly
>> all options have removed this setting.
> > As far as I understand, this setting causes us to overwrite any
> preexisting config values when reconfiguring Scalar? So with your
> changes the effect is that we now don't do that anymore, which allows
> the user to tune some of the configuration values to their liking after
> having run `scalar init` for the first time. I guess that makes sense,
> as it gives the user more flexibility.

Yes, that is correct.

> It does make me wonder though: is it really the most sensible thing to
> overwrite any keys that already exist in the configuration? We may end
> up overwriting configuration specified by the user both in the case of
> `scalar init` and `scalar reconfigure`. But arguably, we might want to
> only ever write configuration that does _not_ yet have an explicit value
> in the configuration file, regardless of whether or not we reconfigure.

I agree that this notion of forcing config is not optimal, and is a leftover
from VFS for Git where some of these config things were actually required
for the virtualization to work. Once that idea was in place, it was easy
to think "we'll make sure the repo is configured correctly" but that makes
much less sense in Scalar these days.

>> However, there is one setting that still has this, which is
>> index.skipHash, which was previously being set to _false_ when we
>> actually prefer the value of true. Keep the overwrite here to help
>> Scalar users upgrade to the new version. We may remove that overwrite in
>> the future once we belive that most of the users who have the false
>> value have upgraded to a version that overwrites that to 'true'.
> > Makes sense. This has likely been a bug, and we now want to rectify that
> bug.

And hopefully this is the only reason we'd need this "overwrite" feature
from this point on.

Thanks,
-Stolee

{
struct scalar_config config[] = {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> From: Derrick Stolee <[email protected]>
>
> These config values were added in the original Scalar contribution,
> d0feac4e8c (scalar: 'register' sets recommended config and starts
> maintenance, 2021-12-03), but were never fully checked for validity in
> the upstream Git project. At the time, Scalar was only intended for the
> contrib/ directory so did not have as rigorous of an investigation.
>
> Each config option has its own justification for removal:
>
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.
>
> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
>
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manger so this option is no longer needed to help users who
>   were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for a conprehensive list.  Very well described.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> * core.preloadIndex: This value is true by default, now. Removing this
>   causes some changes required to the tests that checked this config
>   value. Use gui.gcwarning=false instead.

I was going to ask about if we could also rely on the default value of
index.threads like we do here, but then went and did some reading and realised
some config values, like index.recordOffsetTable, have their value set
according to whether index.threads was explicitly set, so I guess there's an
implicit reliance on that behaviour that we want to keep?

> * core.fscache: This config does not exist in the core Git project, but
>   is instead a config option for a Git for Windows feature.
> 
> * core.multiPackIndex: This config value is now enabled by default, so
>   does not need to be called out specifically. It was originally
>   included to make sure the background maintenance that created
>   multi-pack-indexes would result in the expected performance
>   improvements.
> 
> * credential.validate: This option is not something specific to Git but
>   instead an older version of Git Credential Manager for Windows. That
>   software was replaced several years ago by the cross-platform Git
>   Credential Manger so this option is no longer needed to help users who
>   were on that older software.
> 
> * pack.useSparse=true: This value is now Git's default as of de3a864114
>   (config: set pack.useSparse=true by default, 2020-03-20) so we don't
>   need it set by Scalar.

Thanks for the detail on all of these, very helpful

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Dec 01, 2025 at 05:46:46PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> > * core.preloadIndex: This value is true by default, now. Removing this
> >   causes some changes required to the tests that checked this config
> >   value. Use gui.gcwarning=false instead.
> 
> I was going to ask about if we could also rely on the default value of
> index.threads like we do here, but then went and did some reading and realised
> some config values, like index.recordOffsetTable, have their value set
> according to whether index.threads was explicitly set, so I guess there's an
> implicit reliance on that behaviour that we want to keep?

Wait. Are you saying that "index.recordOffsetTable" behaves differently
based on whether "index.threads" is implicitly enabled due to the
default value or explicitly enabled via the configuration? If so, that
smells like a plain bug to me.

Patrick

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Matthew Hughes wrote (reply to this):

On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> Wait. Are you saying that "index.recordOffsetTable" behaves differently
> based on whether "index.threads" is implicitly enabled due to the
> default value or explicitly enabled via the configuration?

That was my understanding from a cursory read of the results of searching for
'index.threads' in git-config:

> index.recordEndOfIndexEntries
> ...
> Defaults to true if index.threads has been explicitly enabled, false
> otherwise

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Dec 02, 2025 at 07:04:24PM +0000, Matthew Hughes wrote:
> On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> > Wait. Are you saying that "index.recordOffsetTable" behaves differently
> > based on whether "index.threads" is implicitly enabled due to the
> > default value or explicitly enabled via the configuration?
> 
> That was my understanding from a cursory read of the results of searching for
> 'index.threads' in git-config:
> 
> > index.recordEndOfIndexEntries
> > ...
> > Defaults to true if index.threads has been explicitly enabled, false
> > otherwise

Hm, true. At least that's a concious decision then.

The logic around this was introduced in 2a9dedef2e (index: make
index.threads=true enable ieot and eoie, 2018-11-19), and the ultimate
reason for it seems to be backwards compatibility:

    index.threads and index.recordOffsetTable unspecified: do not write
    the offset table yet (to avoid alarming the user with "ignoring IEOT
    extension" messages when an older version of Git accesses the
    repository) but do make use of multiple threads to read the index if
    the supporting offset table is present.

Older versions of Git complained when they see unknown extensions, and
we didn't want to expose users to such warnings. That makes me wonder
whether it's time now to revisit that decision -- it's been 7 years
since then, I guess that many clients nowadays would understand the
extension. 

The only (documented) downside should thus not be that important
anymore, but the upside is that reading the index would be faster if we
default-enable writing the extension.

Patrick

/* Required */
{ "am.keepCR", "true", 1 },
{ "core.FSCache", "true", 1 },
{ "core.multiPackIndex", "true", 1 },
{ "core.preloadIndex", "true", 1 },
{ "am.keepCR", "true" },
{ "commitGraph.changedPaths", "true" },
{ "commitGraph.generationVersion", "1" },
{ "core.autoCRLF", "false" },
{ "core.logAllRefUpdates", "true" },
{ "core.safeCRLF", "false" },
{ "credential.https://dev.azure.com.useHttpPath", "true" },
{ "feature.experimental", "false" },
{ "feature.manyFiles", "false" },
{ "fetch.showForcedUpdates", "false" },
{ "fetch.unpackLimit", "1" },
{ "fetch.writeCommitGraph", "false" },
{ "gc.auto", "0" },
{ "gui.GCWarning", "false" },
{ "index.skipHash", "true", 1 /* Fix previous setting. */ },
{ "index.threads", "true"},
{ "index.version", "4" },
{ "merge.renames", "true" },
{ "merge.stat", "false" },
{ "pack.useBitmaps", "false" },
{ "pack.usePathWalk", "true" },
{ "receive.autoGC", "false" },
{ "status.aheadBehind", "false" },

/* platform-specific */
#ifndef WIN32
{ "core.untrackedCache", "true", 1 },
{ "core.untrackedCache", "true" },
#else
/*
* Unfortunately, Scalar's Functional Tests demonstrated
Expand All @@ -142,36 +173,11 @@ static int set_recommended_config(int reconfigure)
* Therefore, with a sad heart, we disable this very useful
* feature on Windows.
*/
{ "core.untrackedCache", "false", 1 },
#endif
{ "core.logAllRefUpdates", "true", 1 },
{ "credential.https://dev.azure.com.useHttpPath", "true", 1 },
{ "credential.validate", "false", 1 }, /* GCM4W-only */
{ "gc.auto", "0", 1 },
{ "gui.GCWarning", "false", 1 },
{ "index.skipHash", "false", 1 },
{ "index.threads", "true", 1 },
{ "index.version", "4", 1 },
{ "merge.stat", "false", 1 },
{ "merge.renames", "true", 1 },
{ "pack.useBitmaps", "false", 1 },
{ "pack.useSparse", "true", 1 },
{ "receive.autoGC", "false", 1 },
{ "feature.manyFiles", "false", 1 },
{ "feature.experimental", "false", 1 },
{ "fetch.unpackLimit", "1", 1 },
{ "fetch.writeCommitGraph", "false", 1 },
#ifdef WIN32
{ "http.sslBackend", "schannel", 1 },
{ "core.untrackedCache", "false" },

/* Other Windows-specific required settings: */
{ "http.sslBackend", "schannel" },
#endif
/* Optional */
{ "status.aheadBehind", "false" },
{ "commitGraph.changedPaths", "true" },
{ "commitGraph.generationVersion", "1" },
{ "core.autoCRLF", "false" },
{ "core.safeCRLF", "false" },
{ "fetch.showForcedUpdates", "false" },
{ "pack.usePathWalk", "true" },
{ NULL, NULL },
};
int i;
Expand All @@ -197,9 +203,8 @@ static int set_recommended_config(int reconfigure)
if (repo_config_get_string(the_repository, "log.excludeDecoration", &value)) {
trace2_data_string("scalar", the_repository,
"log.excludeDecoration", "created");
if (repo_config_set_multivar_gently(the_repository, "log.excludeDecoration",
"refs/prefetch/*",
CONFIG_REGEX_NONE, 0))
if (set_config_with_comment("log.excludeDecoration",
"refs/prefetch/*"))
return error(_("could not configure "
"log.excludeDecoration"));
} else {
Expand Down
25 changes: 16 additions & 9 deletions t/t9210-scalar.sh
Original file line number Diff line number Diff line change
Expand Up @@ -202,14 +202,17 @@ test_expect_success 'scalar clone --no-... opts' '
test_expect_success 'scalar reconfigure' '
git init one/src &&
scalar register one &&
git -C one/src config core.preloadIndex false &&
git -C one/src config unset gui.gcwarning &&
scalar reconfigure one &&
test true = "$(git -C one/src config core.preloadIndex)" &&
git -C one/src config core.preloadIndex false &&
test false = "$(git -C one/src config gui.gcwarning)" &&
git -C one/src config unset gui.gcwarning &&
rm one/src/cron.txt &&
GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
test_path_is_file one/src/cron.txt &&
test true = "$(git -C one/src config core.preloadIndex)" &&
test false = "$(git -C one/src config gui.gcwarning)" &&
test_grep "GCWarning = false # set by scalar" one/src/.git/config &&
test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&

test_subcommand git maintenance start <reconfigure &&
test_subcommand ! git maintenance unregister --force <reconfigure &&

Expand All @@ -231,25 +234,29 @@ test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
git init $num/src &&
scalar register $num/src &&
git -C $num/src config includeif."onbranch:foo".path something &&
git -C $num/src config core.preloadIndex false || return 1
git -C $num/src config unset gui.gcwarning || return 1
done &&

scalar reconfigure --all &&

for num in $repos
do
test true = "$(git -C $num/src config core.preloadIndex)" || return 1
test false = "$(git -C $num/src config gui.gcwarning)" || return 1
done
'

test_expect_success 'scalar reconfigure --all with detached HEADs' '
# This test demonstrates an issue with index.skipHash=true and
# this test variable for the split index. Disable the test variable.
sane_unset GIT_TEST_SPLIT_INDEX &&

repos="two three four" &&
for num in $repos
do
rm -rf $num/src &&
git init $num/src &&
scalar register $num/src &&
git -C $num/src config core.preloadIndex false &&
git -C $num/src config unset gui.gcwarning &&
test_commit -C $num/src initial &&
git -C $num/src switch --detach HEAD || return 1
done &&
Expand All @@ -258,7 +265,7 @@ test_expect_success 'scalar reconfigure --all with detached HEADs' '

for num in $repos
do
test true = "$(git -C $num/src config core.preloadIndex)" || return 1
test false = "$(git -C $num/src config gui.gcwarning)" || return 1
done
'

Expand Down Expand Up @@ -290,7 +297,7 @@ test_expect_success 'scalar supports -c/-C' '
git init sub &&
scalar -C sub -c status.aheadBehind=bogus register &&
test -z "$(git -C sub config --local status.aheadBehind)" &&
test true = "$(git -C sub config core.preloadIndex)"
test false = "$(git -C sub config gui.gcwarning)"
'

test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
Expand Down
Loading