Apply auto code formating rules only on changed files, add transform as 1st user #48347

hendrikmuhs · 2019-10-22T12:17:02Z

limit auto code formatting to changed files in a PR and enable auto code formatting for the transform plugin

elasticmachine · 2019-10-22T12:17:04Z

Pinging @elastic/ml-core (:ml/Transform)

elasticmachine · 2019-10-22T19:08:50Z

Pinging @elastic/es-core-infra (:Core/Infra/Core)

hendrikmuhs · 2019-10-22T19:16:36Z

This is a POC for applying auto-formatting only to changed files.

Reasons to limit:

tooling changes behavior, for some time we can pin the version but eventually we want to update. This might trigger a full re-format. We faced this problem in other projects
rules change, the set of rules is never perfect, we might want to fine tune it or we change our opinion
backports might get harder if always all files are enforced

IMHO this is more practical and should help adoption.

I added transform as 1st project and applied formatting to 1 file for demonstration (we can split the PR into 2 once we finalize this).

CC @pugnascotia

mark-vieira · 2019-10-23T18:21:49Z

tooling changes behavior, for some time we can pin the version but eventually we want to update. This might trigger a full re-format. We faced this problem in other projects

This doesn't completely solve the problem you describe here, it simply defers it until that file is touched. If formatting rules or toolchain implementation changes down the line, the same "full reformat" will happen, it'll just happen the first time that source file is changed.

rules change, the set of rules is never perfect, we might want to fine tune it or we change our opinion

If this is the case then certainly it would make sense that we would want that to apply to all source. If we found it important enough to change our ruleset then transitively it would imply that we would not just want that limited to future code changes.

backports might get harder if always all files are enforced

I think again this is mostly negated by the first point. If the code rules change, and I change a file, that would trigger a full reformat of that file. So limiting formatting to changed files I don't think would have any practical effect on making backports easier, since by their nature, a backport would include changed files.

I follow this logic in theory, but I think in practice it's unlikely to be less disruptive and has very awkward implementation implications. This is certainly worthy of a broader discussion, but my gut instinct is to say we either go "all in" on the auto formatting for the projects we apply it to or not at all.

pugnascotia · 2019-10-24T08:27:45Z

My primary concern with taking the incremental approach is that we'll never achieve a consistent formatting of the code, which is really the point of all this. Having formatted the entire codebase then trawled over the entire diff, I can testify to how diverse the existing code style really is. This exercise was also useful for making a number of tweaks to the current settings in master.

As to the difficult of backporting, true, there may be some impact going 8 -> 6, since we plan to format master, v8 and v7.x, but I don't believe Cloud API or Cloud UI, who both auto-format their code, are experiencing any extra overhead with backports.

hendrikmuhs · 2019-10-24T09:07:50Z

tooling changes behavior, for some time we can pin the version but eventually we want to update. This might trigger a full re-format. We faced this problem in other projects

This doesn't completely solve the problem you describe here, it simply defers it until that file is touched. If formatting rules or toolchain implementation changes down the line, the same "full reformat" will happen, it'll just happen the first time that source file is changed.

"full reformat" means changing all files, potentially hundreds of files versus doing a reformat of only the 1 file you change. Mixing formatting changes with logical changes still remains a problem, but it's a big difference if a PR is polluted with hundreds of files vs. just 1 file (note that the friendly developer applies formatting changes upfront in a separate commit, so that reviewers can skip over)

rules change, the set of rules is never perfect, we might want to fine tune it or we change our opinion

If this is the case then certainly it would make sense that we would want that to apply to all source. If we found it important enough to change our ruleset then transitively it would imply that we would not just want that limited to future code changes.

What stops you from applying it to all files or a bigger set of files? Just think about it, a rule change potentially triggers a resubmit of the complete code base eventually.

Limiting formatting to changed files does not imply to forbid applying formatting without logical changes. There is no limitation to future code changes.

Having that said I agree with you in 1 point: There should be a parameter that allows you to apply formatting on all files of 1 project independent of changed files. This could be a gradle parameter. Still, this "do_all" switch should be project based and not for the whole git tree.

backports might get harder if always all files are enforced

I think again this is mostly negated by the first point. If the code rules change, and I change a file, that would trigger a full reformat of that file. So limiting formatting to changed files I don't think would have any practical effect on making backports easier, since by their nature, a backport would include changed files.

Again "full reformat" means all files versus 1 file, the likelihood of a merge conflict increases with the number of files you change.

I did a big re-factoring project in the last 2 months and stumbled upon backports: master contains files that don't exist on 7.x and vice versa. All my re-factoring backports failed to cleanly apply: some could be solved by mergetool but some required manual intervention.

I follow this logic in theory, but I think in practice it's unlikely to be less disruptive and has very awkward implementation implications. This is certainly worthy of a broader discussion, but my gut instinct is to say we either go "all in" on the auto formatting for the projects we apply it to or not at all.

I have done "auto-formating projects" in practice on larger projects twice in the past, I tried with "all" and failed twice. "All or nothing" works for smaller projects but once you reach a certain size the friction is to high.

Apart from that, there is no conflict: "changes-only" might be an intermediate step towards "all". Although I think "all" remains a dream, we say "progress over perfection", and I am looking forward to be disproved about "all" being unrealistic.

hendrikmuhs · 2019-10-24T09:14:16Z

My primary concern with taking the incremental approach is that we'll never achieve a consistent formatting of the code, which is really the point of all this. Having formatted the entire codebase then trawled over the entire diff, I can testify to how diverse the existing code style really is. This exercise was also useful for making a number of tweaks to the current settings in master.

I see your point, and as suggested in my other answer: It would be good to be able to apply formatting to a whole project and e.g. do "cleanup PR's". What about a parameter that disables my added "changes-only" code? That way you can have the best of both.

In addition to that there could be a Jenkins job that checks formatting on the whole codebase, produces a metric and sends a weekly email. We have a weekly reminder mail for muted tests, we could have the same for "code beauty".

rjernst · 2019-10-24T20:46:11Z

We discussed in FixItFriday and there was strong consensus we should not delay reformatting to when a file is touched. This could make small PRs (say a change to a single line in a single file) much larger and more difficult to review if there have been any tweaks to the formatting rules since the last time that file was touched.

We think any changes to the format should immediately reformat the entire codebase. Doing so ensures we keep a consistent style instead of having a mixture of different styles over time as we do now. The frequency with which we will make tweaks to the format should be low (after this initial period where we are determining what settings are necessary to get a consistent style), so the pain of conflicts from tweaks should be cumulatively small over time.

hendrikmuhs added >refactoring v8.0.0 :ml/Transform Transform v7.6.0 labels Oct 22, 2019

Hendrik Muhs added 2 commits October 22, 2019 21:01

check only changed files regarding format

3696f4d

format Transform.java

19ca940

hendrikmuhs force-pushed the transform-code-format branch from 1b3d548 to 19ca940 Compare October 22, 2019 19:02

hendrikmuhs added :Core/Infra/Core Core issues without another label discuss labels Oct 22, 2019

improve formating

5de3ca6

hendrikmuhs marked this pull request as ready for review October 23, 2019 07:03

hendrikmuhs changed the title ~~[Transform] enable spotless for transform plugin and apply auto code formating rules~~ Apply auto code formating rules only on changed files, add transform as 1st user Oct 23, 2019

rjernst removed the discuss label Oct 24, 2019

hendrikmuhs closed this Oct 25, 2019

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Apply auto code formating rules only on changed files, add transform as 1st user #48347

Apply auto code formating rules only on changed files, add transform as 1st user #48347

hendrikmuhs commented Oct 22, 2019 •

edited

Loading

elasticmachine commented Oct 22, 2019

elasticmachine commented Oct 22, 2019

hendrikmuhs commented Oct 22, 2019 •

edited

Loading

mark-vieira commented Oct 23, 2019 •

edited

Loading

pugnascotia commented Oct 24, 2019

hendrikmuhs commented Oct 24, 2019

hendrikmuhs commented Oct 24, 2019 •

edited

Loading

rjernst commented Oct 24, 2019

Apply auto code formating rules only on changed files, add transform as 1st user #48347

Apply auto code formating rules only on changed files, add transform as 1st user #48347

Conversation

hendrikmuhs commented Oct 22, 2019 • edited Loading

elasticmachine commented Oct 22, 2019

elasticmachine commented Oct 22, 2019

hendrikmuhs commented Oct 22, 2019 • edited Loading

mark-vieira commented Oct 23, 2019 • edited Loading

pugnascotia commented Oct 24, 2019

hendrikmuhs commented Oct 24, 2019

hendrikmuhs commented Oct 24, 2019 • edited Loading

rjernst commented Oct 24, 2019

hendrikmuhs commented Oct 22, 2019 •

edited

Loading

hendrikmuhs commented Oct 22, 2019 •

edited

Loading

mark-vieira commented Oct 23, 2019 •

edited

Loading

hendrikmuhs commented Oct 24, 2019 •

edited

Loading