-
Notifications
You must be signed in to change notification settings - Fork 160
Audit and document Scalar config #2010
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
639ff98
10e9548
8783db6
edc0254
ac1627d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -197,6 +197,164 @@ delete <enlistment>:: | |
| This subcommand lets you delete an existing Scalar enlistment from your | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
> index f81b2832f8..b34af225e6 100644
> --- a/Documentation/scalar.adoc
> +++ b/Documentation/scalar.adoc
> @@ -197,6 +197,164 @@ delete <enlistment>::
> This subcommand lets you delete an existing Scalar enlistment from your
> local file system, unregistering the repository.
>
> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------
> +
> +As part of both `scalar clone` and `scalar register`, certain Git config
> +values are set to optimize for large repositories or cross-platform support.
> +These options are updated in new Git versions according to the best known
> +advice for large repositories, and users can get the latest recommendations
> +by running `scalar reconfigure [--all]`.
> +
> +This section lists justifications for the config values that are set in the
> +latest version.
> +
> +am.keepCR=true::
> + This setting is important for cross-platform development across Windows
> + and non-Windows platforms and keeping carriage return (`\r`) characters
> + in certain workflows.
> +
> +commitGraph.changedPaths=true::
> + This setting helps the background maintenance steps that compute the
> + serialized commit-graph to also store changed-path Bloom filters. This
> + accelerates file history commands and allows users to automatically
> + benefit without running a foreground command.
Is this something we also want to promote to "default" eventually? The
downside of course is that maintenance takes a bit longer, but given
that it runs in the background anyway this shouldn't really impact our
users all that much.
> +commitGraph.generationVersion=1::
> + While the preferred version is 2 for performance reasons, existing users
> + that had version 1 by default will need special care in upgrading to
> + version 2. This is likely to change in the future as the upgrade story
> + is solidifies.
Is that still the case? We _did_ have some bugs in the upgrade path in
the past, but I thought it got all sorted out by now?
[snip]
> +fetch.unpackLimit=1::
> + This setting prevents Git from unpacking packfiles into loose objects
> + as they are downloaded from the server. This feature was intended as a
> + way to prevent performance issues from too many packfiles, but Scalar
> + uses background maintenance to group packfiles and cover them with a
> + multi-pack-index, removing this issue.
The second sentence here reads as if "fetch.unpackLimit=1" was the
feature you are talking about, which led to some puzzlement at first.
But what you are talking about is the _default_ unpack limit of 100.
Maybe something like this reads better?
This setting prevents Git from unpacking packfiles into loose objects
as they are downloaded from the server. The default limit of 100
objects was intended as a way to prevent performance issues from too
many packfiles, but Scalar uses background maintenance to group
packfiles and cover them with a multi-pack-index, removing this
issue.
PatrickThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Derrick Stolee wrote (reply to this): On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:36PM +0000, Derrick Stolee via GitGitGadget wrote:
>> diff --git a/Documentation/scalar.adoc b/Documentation/scalar.adoc
>> index f81b2832f8..b34af225e6 100644
>> --- a/Documentation/scalar.adoc
>> +++ b/Documentation/scalar.adoc
>> @@ -197,6 +197,164 @@ delete <enlistment>::
>> This subcommand lets you delete an existing Scalar enlistment from your
>> local file system, unregistering the repository.
>> >> +REQUIRED AND RECOMMENDED CONFIG
>> +-------------------------------
>> +
>> +As part of both `scalar clone` and `scalar register`, certain Git config
>> +values are set to optimize for large repositories or cross-platform support.
>> +These options are updated in new Git versions according to the best known
>> +advice for large repositories, and users can get the latest recommendations
>> +by running `scalar reconfigure [--all]`.
>> +
>> +This section lists justifications for the config values that are set in the
>> +latest version.
>> +
>> +am.keepCR=true::
>> + This setting is important for cross-platform development across Windows
>> + and non-Windows platforms and keeping carriage return (`\r`) characters
>> + in certain workflows.
>> +
>> +commitGraph.changedPaths=true::
>> + This setting helps the background maintenance steps that compute the
>> + serialized commit-graph to also store changed-path Bloom filters. This
>> + accelerates file history commands and allows users to automatically
>> + benefit without running a foreground command.
> > Is this something we also want to promote to "default" eventually? The
> downside of course is that maintenance takes a bit longer, but given
> that it runs in the background anyway this shouldn't really impact our
> users all that much.
I'm not sure, as this is a significant cost to the computation time. It will
impact foreground commands, as well. It increases the size of the file, too.
It's worth considering, but I don't think the answer is very simple.
>> +commitGraph.generationVersion=1::
>> + While the preferred version is 2 for performance reasons, existing users
>> + that had version 1 by default will need special care in upgrading to
>> + version 2. This is likely to change in the future as the upgrade story
>> + is solidifies.
> > Is that still the case? We _did_ have some bugs in the upgrade path in
> the past, but I thought it got all sorted out by now?
This is very likely, but I haven't validated myself. I'd be interested to
double-check and update this setting in a later series. If we update to 2,
then this would be a good reason to overwrite the old config for a while.
> [snip]
>> +fetch.unpackLimit=1::
>> + This setting prevents Git from unpacking packfiles into loose objects
>> + as they are downloaded from the server. This feature was intended as a
>> + way to prevent performance issues from too many packfiles, but Scalar
>> + uses background maintenance to group packfiles and cover them with a
>> + multi-pack-index, removing this issue.
> > The second sentence here reads as if "fetch.unpackLimit=1" was the
> feature you are talking about, which led to some puzzlement at first.
> But what you are talking about is the _default_ unpack limit of 100.
> Maybe something like this reads better?
> > This setting prevents Git from unpacking packfiles into loose objects
> as they are downloaded from the server. The default limit of 100
> objects was intended as a way to prevent performance issues from too
> many packfiles, but Scalar uses background maintenance to group
> packfiles and cover them with a multi-pack-index, removing this
> issue.
Good catch. Thanks!
-StoleeThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Matthew Hughes wrote (reply to this): On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> Add user-facing documentation that justifies the values being set by
> 'scalar clone', 'scalar register', and 'scalar reconfigure'.
Thanks! This is exactly what I was hoping for.
> +REQUIRED AND RECOMMENDED CONFIG
> +-------------------------------
Would it be worth noting in scalar.c that the config options listed there are
documented here, So that a dev changing the list in the source will know to
also update this? I assume there's an understanding that if e.g. you update a
flag you should know to also update relevant docs, but perhaps this is a bit
more niche.
> +gc.auto=0::
> + This disables automatic garbage collection, since Scalar uses background
> + maintenance to keep the repository data in good shape.
Checking my understanding: this means there will be _no_ automatic GC in a
scalar repo? Since scalar calls 'maintenance register' which means
maintenance.strategy will be set to 'incremental' which won't schedule any gc
runsThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Mon, Dec 01, 2025 at 05:58:06PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:47PM +0000, Derrick Stolee via GitGitGadget wrote:
> > Add user-facing documentation that justifies the values being set by
> > 'scalar clone', 'scalar register', and 'scalar reconfigure'.
>
> Thanks! This is exactly what I was hoping for.
>
> > +REQUIRED AND RECOMMENDED CONFIG
> > +-------------------------------
>
> Would it be worth noting in scalar.c that the config options listed there are
> documented here, So that a dev changing the list in the source will know to
> also update this? I assume there's an understanding that if e.g. you update a
> flag you should know to also update relevant docs, but perhaps this is a bit
> more niche.
>
> > +gc.auto=0::
> > + This disables automatic garbage collection, since Scalar uses background
> > + maintenance to keep the repository data in good shape.
>
> Checking my understanding: this means there will be _no_ automatic GC in a
> scalar repo? Since scalar calls 'maintenance register' which means
> maintenance.strategy will be set to 'incremental' which won't schedule any gc
> runs
Yes, auto-garbage-collection is completely disabled in repositories
managed by Scalar. And I guess that made sense in the past:
auto-maintenance did not know about maintenance strategies at all, and
consequently it would still run git-gc(1). And that's not really
compatible with the "incremental" strategy that Scalar wants to use.
I changed that in Git 2.52 so that maintenance strategies now apply to
both scheduled and normal maintenance. But I was worried about backwards
compatibility for the "incremental" strategy, so I made the change in a
backwards compatible way so that normal maintenance still ends up using
git-gc(1).
Arguably though, we can now iterate on our infrastructure: if we were to
introduce an "incremental-v2" strategy we could adapt it to have proper
strategies for both scheduled and normal maintenance. And if so, we can
adapt Scalar in such a way that it doesn't have to disable auto
maintenance anymore.
I think that would be a reasonable thing to do. Scheduled maintenance
only runs once per hour, and in a high-activity repo a user may easily
generate tons of objects in that hour that make the repository perform
badly.
Patrick |
||
| local file system, unregistering the repository. | ||
|
|
||
| REQUIRED AND RECOMMENDED CONFIG | ||
| ------------------------------- | ||
|
|
||
| As part of both `scalar clone` and `scalar register`, certain Git config | ||
| values are set to optimize for large repositories or cross-platform support. | ||
| These options are updated in new Git versions according to the best known | ||
| advice for large repositories, and users can get the latest recommendations | ||
| by running `scalar reconfigure [--all]`. | ||
|
|
||
| This section lists justifications for the config values that are set in the | ||
| latest version. | ||
|
|
||
| am.keepCR=true:: | ||
| This setting is important for cross-platform development across Windows | ||
| and non-Windows platforms and keeping carriage return (`\r`) characters | ||
| in certain workflows. | ||
|
|
||
| commitGraph.changedPaths=true:: | ||
| This setting helps the background maintenance steps that compute the | ||
| serialized commit-graph to also store changed-path Bloom filters. This | ||
| accelerates file history commands and allows users to automatically | ||
| benefit without running a foreground command. | ||
|
|
||
| commitGraph.generationVersion=1:: | ||
| While the preferred version is 2 for performance reasons, existing users | ||
| that had version 1 by default will need special care in upgrading to | ||
| version 2. This is likely to change in the future as the upgrade story | ||
| solidifies. | ||
|
|
||
| core.autoCRLF=false:: | ||
| This removes the transformation of worktree files to add CRLF line | ||
| endings when only LF line endings exist. This is removed for performance | ||
| reasons. Repositories that use tools that care about CRLF line endings | ||
| should commit the necessary files with those line endings instead. | ||
|
|
||
| core.logAllRefUpdates=true:: | ||
| This enables the reflog on all branches. While this is a performance | ||
| cost for large repositories, it is frequently an important data source | ||
| for users to get out of bad situations or to seek support from experts. | ||
|
|
||
| core.safeCRLF=false:: | ||
| Similar to `core.autoCRLF=false`, this disables checks around whether | ||
| the CRLF conversion is reversible. This is a performance improvement, | ||
| but can be dangerous if `core.autoCRLF` is reenabled by the user. | ||
|
|
||
| credential.https://dev.azure.com.useHttpPath=true:: | ||
| This setting enables the `credential.useHttpPath` feature only for web | ||
| URLs for Azure DevOps. This is important for users interacting with that | ||
| service using multiple organizations and thus multiple credential | ||
| tokens. | ||
|
|
||
| feature.experimental=false:: | ||
| This disables the "experimental" optimizations grouped under this | ||
| feature config. The expectation is that all valuable optimizations are | ||
| also set explicitly by Scalar config, and any differences are | ||
| intentional. Notable differences include several bitmap-related config | ||
| options which are disabled for client-focused Scalar repos. | ||
|
|
||
| feature.manyFiles=false:: | ||
| This disables the "many files" optimizations grouped under this feature | ||
| config. The expectation is that all valuable optimizations are also set | ||
| explicitly by Scalar config, and any differences are intentional. | ||
|
|
||
| fetch.showForcedUpdates=false:: | ||
| This disables the check at the end of `git fetch` that notifies the user | ||
| if the ref update was a forced update (one where the previous position | ||
| is not reachable from the latest position). This check can be very | ||
| expensive in large repositories, so is disabled and replaced with an | ||
| advice message. Set `advice.fetchShowForcedUpdates=false` to disable | ||
| this advice message. | ||
|
|
||
| fetch.unpackLimit=1:: | ||
| This setting prevents Git from unpacking packfiles into loose objects | ||
| as they are downloaded from the server. The default limit of 100 was | ||
| intended as a way to prevent performance issues from too many packfiles, | ||
| but Scalar uses background maintenance to group packfiles and cover them | ||
| with a multi-pack-index, removing this issue. | ||
|
|
||
| fetch.writeCommitGraph=false:: | ||
| This config setting was created to help users automatically update their | ||
| commit-graph files as they perform fetches. However, this takes time | ||
| from foreground fetches and pulls and Scalar uses background maintenance | ||
| for this function instead. | ||
|
|
||
| gc.auto=0:: | ||
| This disables automatic garbage collection, since Scalar uses background | ||
| maintenance to keep the repository data in good shape. | ||
|
|
||
| gui.GCWarning=false:: | ||
| Since Scalar disables garbage collection by setting `gc.auto=0`, the | ||
| `git-gui` tool may start to warn about this setting. Disable this | ||
| warning as Scalar's background maintenance configuration makes the | ||
| warning irrelevant. | ||
|
|
||
| index.skipHash=true:: | ||
| Disable computing the hash of the index contents as it is being written. | ||
| This assists with performance, especially for large index files. | ||
|
|
||
| index.threads=true:: | ||
| This tells Git to automatically detect how many threads it should use | ||
| when reading the index due to the default value of `core.preloadIndex`, | ||
| which enables parallel index reads. | ||
|
|
||
| index.version=4:: | ||
| This index version adds compression to the path names, reducing the size | ||
| of the index in a significant way for large repos. This is an important | ||
| performance boost. | ||
|
|
||
| merge.renames=true:: | ||
| When computing merges in large repos, it is particularly important to | ||
| detect renames to maximize the potential for a result that will validate | ||
| correctly. Users performing merges locally are more likely to be doing | ||
| so because a server-side merge (via pull request or similar) resulted in | ||
| conflicts. While this is the default setting, it is set specifically to | ||
| override a potential change to `diff.renames` which a user may set for | ||
| performance reasons. | ||
|
|
||
| merge.stat=false:: | ||
| This disables a diff output after computing a merge. This improves | ||
| performance of `git merge` for large repos while reducing noisy output. | ||
|
|
||
| pack.useBitmaps=false:: | ||
| This disables the use of `.bitmap` files attached to packfiles. Bitmap | ||
| files are optimized for server-side use, not client-side use. Scalar | ||
| disables this to avoid some performance issues that can occur if a user | ||
| accidentally creates `.bitmap` files. | ||
|
|
||
| pack.usePathWalk=true:: | ||
| This enables the `--path-walk` option to `git pack-objects` by default. | ||
| This can accelerate the computation and compression of packfiles created | ||
| by `git push` and other repack operations. | ||
|
|
||
| receive.autoGC=false:: | ||
| Similar to `gc.auto`, this setting is disabled in preference of | ||
| background maintenance. | ||
|
|
||
| status.aheadBehind=false:: | ||
| This disables the ahead/behind calculation that would normally happen | ||
| during a `git status` command. This information is frequently ignored by | ||
| users but can be expensive to calculate in large repos that receive | ||
| thousands of commits per day. The calculation is replaced with an advice | ||
| message that can be disabled by disabling the `advice.statusAheadBehind` | ||
| config. | ||
|
|
||
| The following settings are different based on which platform is in use: | ||
|
|
||
| core.untrackedCache=(true|false):: | ||
| The untracked cache feature is important for performance benefits on | ||
| large repositories, but has demonstrated some bugs on Windows | ||
| filesystems. Thus, this is set for other platforms but disabled on | ||
| Windows. | ||
|
|
||
| http.sslBackend=schannel:: | ||
| On Windows, the `openssl` backend has some issues with certain types of | ||
| remote providers and certificate types. Override the default setting to | ||
| avoid these common problems. | ||
|
|
||
|
|
||
| SEE ALSO | ||
| -------- | ||
| linkgit:git-clone[1], linkgit:git-maintenance[1]. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -19,6 +19,7 @@ | |
| #include "help.h" | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> Add "# set by scalar" to the end of each config option to assist users
> in identifying why these config options were set in their repo.
The implementation is quite straight-forward, inlining expansion of
repo_config_set_gently() in the places that we want to add comment to.
If we had (a lot) more than two callsites, I would have suggested to
add a simple helper function, something like
static int scalar_config_set(struct repository *r, const char *key, const char *value)
{
char *file = repo_git_path(r, "config");
int res = repo_config_set_multivar_in_file_gently(r, file,
key, value, NULL, " # set by scalar", 0);
free(file);
return res;
}
and then the updates to the callers would have been absolute minimum.
Well, even with only two callsites, perhaps such a refactoring may
still have value in reducing the risk of typo in the comment.
> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
> GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
> test_path_is_file one/src/cron.txt &&
> test true = "$(git -C one/src config core.preloadIndex)" &&
> + test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> + test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
> test_subcommand git maintenance start <reconfigure &&
> test_subcommand ! git maintenance unregister --force <reconfigure &&
Looks good.There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Nov 26, 2025 at 03:55:10PM -0800, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>
> > Add "# set by scalar" to the end of each config option to assist users
> > in identifying why these config options were set in their repo.
>
> The implementation is quite straight-forward, inlining expansion of
> repo_config_set_gently() in the places that we want to add comment to.
>
> If we had (a lot) more than two callsites, I would have suggested to
> add a simple helper function, something like
>
> static int scalar_config_set(struct repository *r, const char *key, const char *value)
> {
> char *file = repo_git_path(r, "config");
> int res = repo_config_set_multivar_in_file_gently(r, file,
> key, value, NULL, " # set by scalar", 0);
> free(file);
> return res;
> }
>
> and then the updates to the callers would have been absolute minimum.
>
> Well, even with only two callsites, perhaps such a refactoring may
> still have value in reducing the risk of typo in the comment.
Agreed, I think it's a good idea to provide such a function. The calls
to `repo_config_set_multivar_in_file_gently()` are quite verbose.
PatrickThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Mon, Dec 01, 2025 at 04:50:43PM +0000, Derrick Stolee via GitGitGadget wrote:
> diff --git a/t/t9210-scalar.sh b/t/t9210-scalar.sh
> index bd6f0c40d2..43c210a23d 100755
> --- a/t/t9210-scalar.sh
> +++ b/t/t9210-scalar.sh
> @@ -210,6 +210,9 @@ test_expect_success 'scalar reconfigure' '
> GIT_TRACE2_EVENT="$(pwd)/reconfigure" scalar reconfigure -a &&
> test_path_is_file one/src/cron.txt &&
> test true = "$(git -C one/src config core.preloadIndex)" &&
> + test_grep "preloadIndex = true # set by scalar" one/src/.git/config &&
> + test_grep "excludeDecoration = refs/prefetch/\* # set by scalar" one/src/.git/config &&
> +
> test_subcommand git maintenance start <reconfigure &&
> test_subcommand ! git maintenance unregister --force <reconfigure &&
We _could_ make this a bit more solid by adding a test that:
1. Initializes a new repository.
2. Saves the configuration.
3. Performs `scalar reconfigure`.
4. Asserts that all new non-section-header lines in the configuration
have a trailing "#set by scalar" comment.
This would ensure that there is no callsite we forgot to add the new
annotation to, and that there are new future callsites where somebody
isn't aware of the comments.
I don't insist on such a test though, so please feel free to ignore this
suggestion.
Patrick |
||
| #include "setup.h" | ||
| #include "trace2.h" | ||
| #include "path.h" | ||
|
|
||
| static void setup_enlistment_directory(int argc, const char **argv, | ||
| const char * const *usagestr, | ||
|
|
@@ -95,6 +96,16 @@ struct scalar_config { | |
| int overwrite_on_reconfigure; | ||
| }; | ||
|
|
||
| static int set_config_with_comment(const char *key, const char *value) | ||
| { | ||
| char *file = repo_git_path(the_repository, "config"); | ||
| int res = repo_config_set_multivar_in_file_gently(the_repository, file, | ||
| key, value, NULL, | ||
| " # set by scalar", 0); | ||
| free(file); | ||
| return res; | ||
| } | ||
|
|
||
| static int set_scalar_config(const struct scalar_config *config, int reconfigure) | ||
| { | ||
| char *value = NULL; | ||
|
|
@@ -103,7 +114,7 @@ static int set_scalar_config(const struct scalar_config *config, int reconfigure | |
| if ((reconfigure && config->overwrite_on_reconfigure) || | ||
| repo_config_get_string(the_repository, config->key, &value)) { | ||
| trace2_data_string("scalar", the_repository, config->key, "created"); | ||
| res = repo_config_set_gently(the_repository, config->key, config->value); | ||
| res = set_config_with_comment(config->key, config->value); | ||
| } else { | ||
| trace2_data_string("scalar", the_repository, config->key, "exists"); | ||
| res = 0; | ||
|
|
@@ -122,13 +133,33 @@ static int have_fsmonitor_support(void) | |
| static int set_recommended_config(int reconfigure) | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <[email protected]>
>
> The config values set by Scalar went through an audit in the previous
> changes, so now reorganize the settings and simplify their purpose.
>
> First, alphabetize the config options, except put the platform-specific
> options at the end. This groups two Windows-specific settings and only
> one non-Windows setting.
>
> Also, this removes the 'overwrite_on_reconfigure' setting for many of
> these options. That setting made nearly all of these options "required"
> for scalar enlistments, restricting use for users. Instead, now nearly
> all options have removed this setting.
As far as I understand, this setting causes us to overwrite any
preexisting config values when reconfiguring Scalar? So with your
changes the effect is that we now don't do that anymore, which allows
the user to tune some of the configuration values to their liking after
having run `scalar init` for the first time. I guess that makes sense,
as it gives the user more flexibility.
It does make me wonder though: is it really the most sensible thing to
overwrite any keys that already exist in the configuration? We may end
up overwriting configuration specified by the user both in the case of
`scalar init` and `scalar reconfigure`. But arguably, we might want to
only ever write configuration that does _not_ yet have an explicit value
in the configuration file, regardless of whether or not we reconfigure.
> However, there is one setting that still has this, which is
> index.skipHash, which was previously being set to _false_ when we
> actually prefer the value of true. Keep the overwrite here to help
> Scalar users upgrade to the new version. We may remove that overwrite in
> the future once we belive that most of the users who have the false
> value have upgraded to a version that overwrites that to 'true'.
Makes sense. This has likely been a bug, and we now want to rectify that
bug.
Thanks!
PatrickThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Derrick Stolee wrote (reply to this): On 12/1/25 3:55 AM, Patrick Steinhardt wrote:
> On Wed, Nov 26, 2025 at 10:18:35PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <[email protected]>
>>
>> The config values set by Scalar went through an audit in the previous
>> changes, so now reorganize the settings and simplify their purpose.
>>
>> First, alphabetize the config options, except put the platform-specific
>> options at the end. This groups two Windows-specific settings and only
>> one non-Windows setting.
>>
>> Also, this removes the 'overwrite_on_reconfigure' setting for many of
>> these options. That setting made nearly all of these options "required"
>> for scalar enlistments, restricting use for users. Instead, now nearly
>> all options have removed this setting.
> > As far as I understand, this setting causes us to overwrite any
> preexisting config values when reconfiguring Scalar? So with your
> changes the effect is that we now don't do that anymore, which allows
> the user to tune some of the configuration values to their liking after
> having run `scalar init` for the first time. I guess that makes sense,
> as it gives the user more flexibility.
Yes, that is correct.
> It does make me wonder though: is it really the most sensible thing to
> overwrite any keys that already exist in the configuration? We may end
> up overwriting configuration specified by the user both in the case of
> `scalar init` and `scalar reconfigure`. But arguably, we might want to
> only ever write configuration that does _not_ yet have an explicit value
> in the configuration file, regardless of whether or not we reconfigure.
I agree that this notion of forcing config is not optimal, and is a leftover
from VFS for Git where some of these config things were actually required
for the virtualization to work. Once that idea was in place, it was easy
to think "we'll make sure the repo is configured correctly" but that makes
much less sense in Scalar these days.
>> However, there is one setting that still has this, which is
>> index.skipHash, which was previously being set to _false_ when we
>> actually prefer the value of true. Keep the overwrite here to help
>> Scalar users upgrade to the new version. We may remove that overwrite in
>> the future once we belive that most of the users who have the false
>> value have upgraded to a version that overwrites that to 'true'.
> > Makes sense. This has likely been a bug, and we now want to rectify that
> bug.
And hopefully this is the only reason we'd need this "overwrite" feature
from this point on.
Thanks,
-Stolee |
||
| { | ||
| struct scalar_config config[] = { | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Junio C Hamano wrote (reply to this): "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> From: Derrick Stolee <[email protected]>
>
> These config values were added in the original Scalar contribution,
> d0feac4e8c (scalar: 'register' sets recommended config and starts
> maintenance, 2021-12-03), but were never fully checked for validity in
> the upstream Git project. At the time, Scalar was only intended for the
> contrib/ directory so did not have as rigorous of an investigation.
>
> Each config option has its own justification for removal:
>
> * core.preloadIndex: This value is true by default, now. Removing this
> causes some changes required to the tests that checked this config
> value. Use gui.gcwarning=false instead.
>
> * core.fscache: This config does not exist in the core Git project, but
> is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
> does not need to be called out specifically. It was originally
> included to make sure the background maintenance that created
> multi-pack-indexes would result in the expected performance
> improvements.
>
> * credential.validate: This option is not something specific to Git but
> instead an older version of Git Credential Manager for Windows. That
> software was replaced several years ago by the cross-platform Git
> Credential Manger so this option is no longer needed to help users who
> were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
> (config: set pack.useSparse=true by default, 2020-03-20) so we don't
> need it set by Scalar.
Thanks for a conprehensive list. Very well described.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Matthew Hughes wrote (reply to this): On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> * core.preloadIndex: This value is true by default, now. Removing this
> causes some changes required to the tests that checked this config
> value. Use gui.gcwarning=false instead.
I was going to ask about if we could also rely on the default value of
index.threads like we do here, but then went and did some reading and realised
some config values, like index.recordOffsetTable, have their value set
according to whether index.threads was explicitly set, so I guess there's an
implicit reliance on that behaviour that we want to keep?
> * core.fscache: This config does not exist in the core Git project, but
> is instead a config option for a Git for Windows feature.
>
> * core.multiPackIndex: This config value is now enabled by default, so
> does not need to be called out specifically. It was originally
> included to make sure the background maintenance that created
> multi-pack-indexes would result in the expected performance
> improvements.
>
> * credential.validate: This option is not something specific to Git but
> instead an older version of Git Credential Manager for Windows. That
> software was replaced several years ago by the cross-platform Git
> Credential Manger so this option is no longer needed to help users who
> were on that older software.
>
> * pack.useSparse=true: This value is now Git's default as of de3a864114
> (config: set pack.useSparse=true by default, 2020-03-20) so we don't
> need it set by Scalar.
Thanks for the detail on all of these, very helpfulThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Mon, Dec 01, 2025 at 05:46:46PM +0000, Matthew Hughes wrote:
> On Mon, Dec 01, 2025 at 04:50:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> > * core.preloadIndex: This value is true by default, now. Removing this
> > causes some changes required to the tests that checked this config
> > value. Use gui.gcwarning=false instead.
>
> I was going to ask about if we could also rely on the default value of
> index.threads like we do here, but then went and did some reading and realised
> some config values, like index.recordOffsetTable, have their value set
> according to whether index.threads was explicitly set, so I guess there's an
> implicit reliance on that behaviour that we want to keep?
Wait. Are you saying that "index.recordOffsetTable" behaves differently
based on whether "index.threads" is implicitly enabled due to the
default value or explicitly enabled via the configuration? If so, that
smells like a plain bug to me.
PatrickThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Matthew Hughes wrote (reply to this): On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> Wait. Are you saying that "index.recordOffsetTable" behaves differently
> based on whether "index.threads" is implicitly enabled due to the
> default value or explicitly enabled via the configuration?
That was my understanding from a cursory read of the results of searching for
'index.threads' in git-config:
> index.recordEndOfIndexEntries
> ...
> Defaults to true if index.threads has been explicitly enabled, false
> otherwiseThere was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Tue, Dec 02, 2025 at 07:04:24PM +0000, Matthew Hughes wrote:
> On Tue, Dec 02, 2025 at 08:53:45AM +0100, Patrick Steinhardt wrote:
> > Wait. Are you saying that "index.recordOffsetTable" behaves differently
> > based on whether "index.threads" is implicitly enabled due to the
> > default value or explicitly enabled via the configuration?
>
> That was my understanding from a cursory read of the results of searching for
> 'index.threads' in git-config:
>
> > index.recordEndOfIndexEntries
> > ...
> > Defaults to true if index.threads has been explicitly enabled, false
> > otherwise
Hm, true. At least that's a concious decision then.
The logic around this was introduced in 2a9dedef2e (index: make
index.threads=true enable ieot and eoie, 2018-11-19), and the ultimate
reason for it seems to be backwards compatibility:
index.threads and index.recordOffsetTable unspecified: do not write
the offset table yet (to avoid alarming the user with "ignoring IEOT
extension" messages when an older version of Git accesses the
repository) but do make use of multiple threads to read the index if
the supporting offset table is present.
Older versions of Git complained when they see unknown extensions, and
we didn't want to expose users to such warnings. That makes me wonder
whether it's time now to revisit that decision -- it's been 7 years
since then, I guess that many clients nowadays would understand the
extension.
The only (documented) downside should thus not be that important
anymore, but the upside is that reading the index would be faster if we
default-enable writing the extension.
Patrick |
||
| /* Required */ | ||
| { "am.keepCR", "true", 1 }, | ||
| { "core.FSCache", "true", 1 }, | ||
| { "core.multiPackIndex", "true", 1 }, | ||
| { "core.preloadIndex", "true", 1 }, | ||
| { "am.keepCR", "true" }, | ||
| { "commitGraph.changedPaths", "true" }, | ||
| { "commitGraph.generationVersion", "1" }, | ||
| { "core.autoCRLF", "false" }, | ||
| { "core.logAllRefUpdates", "true" }, | ||
| { "core.safeCRLF", "false" }, | ||
| { "credential.https://dev.azure.com.useHttpPath", "true" }, | ||
| { "feature.experimental", "false" }, | ||
| { "feature.manyFiles", "false" }, | ||
| { "fetch.showForcedUpdates", "false" }, | ||
| { "fetch.unpackLimit", "1" }, | ||
| { "fetch.writeCommitGraph", "false" }, | ||
| { "gc.auto", "0" }, | ||
| { "gui.GCWarning", "false" }, | ||
| { "index.skipHash", "true", 1 /* Fix previous setting. */ }, | ||
| { "index.threads", "true"}, | ||
| { "index.version", "4" }, | ||
| { "merge.renames", "true" }, | ||
| { "merge.stat", "false" }, | ||
| { "pack.useBitmaps", "false" }, | ||
| { "pack.usePathWalk", "true" }, | ||
| { "receive.autoGC", "false" }, | ||
| { "status.aheadBehind", "false" }, | ||
|
|
||
| /* platform-specific */ | ||
| #ifndef WIN32 | ||
| { "core.untrackedCache", "true", 1 }, | ||
| { "core.untrackedCache", "true" }, | ||
| #else | ||
| /* | ||
| * Unfortunately, Scalar's Functional Tests demonstrated | ||
|
|
@@ -142,36 +173,11 @@ static int set_recommended_config(int reconfigure) | |
| * Therefore, with a sad heart, we disable this very useful | ||
| * feature on Windows. | ||
| */ | ||
| { "core.untrackedCache", "false", 1 }, | ||
| #endif | ||
| { "core.logAllRefUpdates", "true", 1 }, | ||
| { "credential.https://dev.azure.com.useHttpPath", "true", 1 }, | ||
| { "credential.validate", "false", 1 }, /* GCM4W-only */ | ||
| { "gc.auto", "0", 1 }, | ||
| { "gui.GCWarning", "false", 1 }, | ||
| { "index.skipHash", "false", 1 }, | ||
| { "index.threads", "true", 1 }, | ||
| { "index.version", "4", 1 }, | ||
| { "merge.stat", "false", 1 }, | ||
| { "merge.renames", "true", 1 }, | ||
| { "pack.useBitmaps", "false", 1 }, | ||
| { "pack.useSparse", "true", 1 }, | ||
| { "receive.autoGC", "false", 1 }, | ||
| { "feature.manyFiles", "false", 1 }, | ||
| { "feature.experimental", "false", 1 }, | ||
| { "fetch.unpackLimit", "1", 1 }, | ||
| { "fetch.writeCommitGraph", "false", 1 }, | ||
| #ifdef WIN32 | ||
| { "http.sslBackend", "schannel", 1 }, | ||
| { "core.untrackedCache", "false" }, | ||
|
|
||
| /* Other Windows-specific required settings: */ | ||
| { "http.sslBackend", "schannel" }, | ||
| #endif | ||
| /* Optional */ | ||
| { "status.aheadBehind", "false" }, | ||
| { "commitGraph.changedPaths", "true" }, | ||
| { "commitGraph.generationVersion", "1" }, | ||
| { "core.autoCRLF", "false" }, | ||
| { "core.safeCRLF", "false" }, | ||
| { "fetch.showForcedUpdates", "false" }, | ||
| { "pack.usePathWalk", "true" }, | ||
| { NULL, NULL }, | ||
| }; | ||
| int i; | ||
|
|
@@ -197,9 +203,8 @@ static int set_recommended_config(int reconfigure) | |
| if (repo_config_get_string(the_repository, "log.excludeDecoration", &value)) { | ||
| trace2_data_string("scalar", the_repository, | ||
| "log.excludeDecoration", "created"); | ||
| if (repo_config_set_multivar_gently(the_repository, "log.excludeDecoration", | ||
| "refs/prefetch/*", | ||
| CONFIG_REGEX_NONE, 0)) | ||
| if (set_config_with_comment("log.excludeDecoration", | ||
| "refs/prefetch/*")) | ||
| return error(_("could not configure " | ||
| "log.excludeDecoration")); | ||
| } else { | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):