Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply auto code formating rules only on changed files, add transform as 1st user #48347

Closed
wants to merge 3 commits into from

Conversation

hendrikmuhs
Copy link

@hendrikmuhs hendrikmuhs commented Oct 22, 2019

limit auto code formatting to changed files in a PR and enable auto code formatting for the transform plugin

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml/Transform)

@hendrikmuhs hendrikmuhs force-pushed the transform-code-format branch from 1b3d548 to 19ca940 Compare October 22, 2019 19:02
@hendrikmuhs hendrikmuhs added :Core/Infra/Core Core issues without another label discuss labels Oct 22, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (:Core/Infra/Core)

@hendrikmuhs
Copy link
Author

hendrikmuhs commented Oct 22, 2019

This is a POC for applying auto-formatting only to changed files.

Reasons to limit:

  • tooling changes behavior, for some time we can pin the version but eventually we want to update. This might trigger a full re-format. We faced this problem in other projects
  • rules change, the set of rules is never perfect, we might want to fine tune it or we change our opinion
  • backports might get harder if always all files are enforced

IMHO this is more practical and should help adoption.

I added transform as 1st project and applied formatting to 1 file for demonstration (we can split the PR into 2 once we finalize this).

CC @pugnascotia

@hendrikmuhs hendrikmuhs marked this pull request as ready for review October 23, 2019 07:03
@hendrikmuhs hendrikmuhs changed the title [Transform] enable spotless for transform plugin and apply auto code formating rules Apply auto code formating rules only on changed files, add transform as 1st user Oct 23, 2019
@mark-vieira
Copy link
Contributor

mark-vieira commented Oct 23, 2019

tooling changes behavior, for some time we can pin the version but eventually we want to update. This might trigger a full re-format. We faced this problem in other projects

This doesn't completely solve the problem you describe here, it simply defers it until that file is touched. If formatting rules or toolchain implementation changes down the line, the same "full reformat" will happen, it'll just happen the first time that source file is changed.

rules change, the set of rules is never perfect, we might want to fine tune it or we change our opinion

If this is the case then certainly it would make sense that we would want that to apply to all source. If we found it important enough to change our ruleset then transitively it would imply that we would not just want that limited to future code changes.

backports might get harder if always all files are enforced

I think again this is mostly negated by the first point. If the code rules change, and I change a file, that would trigger a full reformat of that file. So limiting formatting to changed files I don't think would have any practical effect on making backports easier, since by their nature, a backport would include changed files.

I follow this logic in theory, but I think in practice it's unlikely to be less disruptive and has very awkward implementation implications. This is certainly worthy of a broader discussion, but my gut instinct is to say we either go "all in" on the auto formatting for the projects we apply it to or not at all.

@pugnascotia
Copy link
Contributor

My primary concern with taking the incremental approach is that we'll never achieve a consistent formatting of the code, which is really the point of all this. Having formatted the entire codebase then trawled over the entire diff, I can testify to how diverse the existing code style really is. This exercise was also useful for making a number of tweaks to the current settings in master.

As to the difficult of backporting, true, there may be some impact going 8 -> 6, since we plan to format master, v8 and v7.x, but I don't believe Cloud API or Cloud UI, who both auto-format their code, are experiencing any extra overhead with backports.

@hendrikmuhs
Copy link
Author

tooling changes behavior, for some time we can pin the version but eventually we want to update. This might trigger a full re-format. We faced this problem in other projects

This doesn't completely solve the problem you describe here, it simply defers it until that file is touched. If formatting rules or toolchain implementation changes down the line, the same "full reformat" will happen, it'll just happen the first time that source file is changed.

"full reformat" means changing all files, potentially hundreds of files versus doing a reformat of only the 1 file you change. Mixing formatting changes with logical changes still remains a problem, but it's a big difference if a PR is polluted with hundreds of files vs. just 1 file (note that the friendly developer applies formatting changes upfront in a separate commit, so that reviewers can skip over)

rules change, the set of rules is never perfect, we might want to fine tune it or we change our opinion

If this is the case then certainly it would make sense that we would want that to apply to all source. If we found it important enough to change our ruleset then transitively it would imply that we would not just want that limited to future code changes.

What stops you from applying it to all files or a bigger set of files? Just think about it, a rule change potentially triggers a resubmit of the complete code base eventually.

Limiting formatting to changed files does not imply to forbid applying formatting without logical changes. There is no limitation to future code changes.

Having that said I agree with you in 1 point: There should be a parameter that allows you to apply formatting on all files of 1 project independent of changed files. This could be a gradle parameter. Still, this "do_all" switch should be project based and not for the whole git tree.

backports might get harder if always all files are enforced

I think again this is mostly negated by the first point. If the code rules change, and I change a file, that would trigger a full reformat of that file. So limiting formatting to changed files I don't think would have any practical effect on making backports easier, since by their nature, a backport would include changed files.

Again "full reformat" means all files versus 1 file, the likelihood of a merge conflict increases with the number of files you change.

I did a big re-factoring project in the last 2 months and stumbled upon backports: master contains files that don't exist on 7.x and vice versa. All my re-factoring backports failed to cleanly apply: some could be solved by mergetool but some required manual intervention.

I follow this logic in theory, but I think in practice it's unlikely to be less disruptive and has very awkward implementation implications. This is certainly worthy of a broader discussion, but my gut instinct is to say we either go "all in" on the auto formatting for the projects we apply it to or not at all.

I have done "auto-formating projects" in practice on larger projects twice in the past, I tried with "all" and failed twice. "All or nothing" works for smaller projects but once you reach a certain size the friction is to high.

Apart from that, there is no conflict: "changes-only" might be an intermediate step towards "all". Although I think "all" remains a dream, we say "progress over perfection", and I am looking forward to be disproved about "all" being unrealistic.

@hendrikmuhs
Copy link
Author

hendrikmuhs commented Oct 24, 2019

My primary concern with taking the incremental approach is that we'll never achieve a consistent formatting of the code, which is really the point of all this. Having formatted the entire codebase then trawled over the entire diff, I can testify to how diverse the existing code style really is. This exercise was also useful for making a number of tweaks to the current settings in master.

I see your point, and as suggested in my other answer: It would be good to be able to apply formatting to a whole project and e.g. do "cleanup PR's". What about a parameter that disables my added "changes-only" code? That way you can have the best of both.

In addition to that there could be a Jenkins job that checks formatting on the whole codebase, produces a metric and sends a weekly email. We have a weekly reminder mail for muted tests, we could have the same for "code beauty".

@rjernst
Copy link
Member

rjernst commented Oct 24, 2019

We discussed in FixItFriday and there was strong consensus we should not delay reformatting to when a file is touched. This could make small PRs (say a change to a single line in a single file) much larger and more difficult to review if there have been any tweaks to the formatting rules since the last time that file was touched.

We think any changes to the format should immediately reformat the entire codebase. Doing so ensures we keep a consistent style instead of having a mixture of different styles over time as we do now. The frequency with which we will make tweaks to the format should be low (after this initial period where we are determining what settings are necessary to get a consistent style), so the pain of conflicts from tweaks should be cumulatively small over time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants