Skip to content

Commit 9166877

Browse files
Add a script to automatically format C++/Python files upon git push (#9293)
Co-authored-by: Zhanyong Wan <[email protected]>
1 parent 002f27a commit 9166877

File tree

2 files changed

+256
-26
lines changed

2 files changed

+256
-26
lines changed

CONTRIBUTING.md

Lines changed: 43 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -212,25 +212,25 @@ first time, you may need to build everything again, for example, after a
212212

213213
* Clone the _PyTorch_ repo as per [instructions](https://github.com/pytorch/pytorch#from-source).
214214

215-
```Shell
215+
```shell
216216
git clone --recursive https://github.com/pytorch/pytorch
217217
cd pytorch/
218218
```
219219

220220
* Clone the _PyTorch/XLA_ repo:
221221

222-
```Shell
222+
```shell
223223
git clone --recursive https://github.com/pytorch/xla.git
224224
```
225225

226226
* Build PyTorch
227-
```Shell
227+
```shell
228228
# pytorch/xla requires pytorch wheel to be presented under pytorch/dist
229229
python setup.py bdist_wheel
230230
python setup.py develop
231231
```
232232
* Build PyTorch/XLA
233-
```Shell
233+
```shell
234234
cd xla/
235235
python setup.py develop
236236
```
@@ -241,30 +241,35 @@ Please refer to this [guide](https://github.com/pytorch/xla/blob/master/plugins/
241241

242242
## Before Creating a Pull Request
243243

244-
In `pytorch/xla` repo we enforce coding style for both C++ and Python files. Please try to format
245-
your code before creating a pull request.
244+
In `pytorch/xla` repo we enforce coding style for both C++ and Python files.
245+
Specifically, we use `clang-format-11` with a customized style config to format
246+
C++, and `yapf` (specially version 0.40.2) with a customized style config
247+
to format Python. Please ensure that your change is formatted properly before
248+
sending out a PR.
246249

247-
### C++ Style Guide
250+
The easiest way to do this is to set up a `git push` hook to automatically
251+
format changed or added C++ and Python files before pushing:
248252

249-
`pytorch/xla` uses `clang-format-11` with a customized style config.
250-
If your PR touches the C++ source files, please run the following command before submitting a PR.
251-
252-
```Shell
253-
# How to install: sudo apt install clang-format-11
254-
# If your PR only changes foo.cpp, run the following in xla/ folder
255-
clang-format-11 -i -style=file /PATH/TO/foo.cpp
256-
# To format all cpp files, run the following in xla/ folder
257-
find -name '*.cpp' -o -name '*.h' -o -name '*.cc' | xargs clang-format-11 -i -style=file
253+
First, install the necessary tools if needed:
254+
```shell
255+
cd $WORKSPACE_DIR/pytorch/xla
256+
# If clang-format-11 is not yet installed...
257+
sudo apt install clang-format-11
258+
# If yapf 0.40.2 is not yet installed...
259+
pip install yapf==0.40.2
258260
```
259261

260-
### Python Style Guide
262+
Then, set up the git push hook:
263+
```shell
264+
scripts/git_fix.py --set_git_push_hook
265+
```
261266

262-
`pytorch/xla` uses `yapf`(specially version 0.40.2 in case it's not backward compatible) with a customized style config.
263-
If your PR touches the Python source files, please run the following command before submitting a PR.
267+
Now, whenever you run `git push`, the C++ and Python files will be
268+
automatically formatted according to our style guide.
264269

265-
```Shell
266-
# How to install: pip install yapf==0.40.2
267-
yapf --recursive -i *.py test/ scripts/ torch_xla/ benchmarks/ torchax/
270+
You can also format the files manually by running
271+
```shell
272+
scripts/git_fix.py
268273
```
269274

270275
### Running the Tests
@@ -273,19 +278,19 @@ To run the tests, follow __one__ of the options below:
273278

274279
* Run on local CPU:
275280

276-
```Shell
281+
```shell
277282
export PJRT_DEVICE=CPU
278283
```
279284

280285
* Run on Cloud TPU:
281286

282-
```Shell
287+
```shell
283288
export PJRT_DEVICE=TPU
284289
```
285290

286291
* Run on GPU:
287292

288-
```Shell
293+
```shell
289294
export PJRT_DEVICE=CUDA GPU_NUM_DEVICES=${NUM_GPU}
290295
```
291296

@@ -419,4 +424,16 @@ git rebase upstream/master
419424
# When you are done, push the updated branch to your fork on GitHub. This will
420425
# update the PR.
421426
git push --force-with-lease origin your-branch-name
422-
```
427+
```
428+
429+
## Fixing Git Metadata File Permissions
430+
431+
Normally we run `git` commands outside of the dev container. If we run
432+
a mutating `git` command inside the dev container, it may change the owner
433+
of some files inside the `.git` directory to `root`, which will prevent us from
434+
running `git` commands outside of the dev container. To fix this, run the
435+
following commands *outside of the dev container* to fix the file owners:
436+
437+
```shell
438+
sudo chown -R $USER .git
439+
```

scripts/git_fix.py

Lines changed: 213 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,213 @@
1+
#!/usr/bin/env python3
2+
"""Fixes the format of locally changed git files.
3+
"""
4+
5+
import argparse
6+
import os
7+
import re
8+
import sys
9+
10+
# Root of the PyTorch/XLA repo.
11+
_PTXLA_DIR = os.path.abspath(os.path.dirname(__file__) + '/..')
12+
13+
# Path of this script, relative to the repo root.
14+
_SCRIPT_PATH = os.path.abspath(__file__)[len(_PTXLA_DIR) + 1:]
15+
16+
# Names of CLIs for formatting files.
17+
_CLANG_FORMAT = 'clang-format-11'
18+
_YAPF = 'yapf'
19+
20+
# Path of the git pre-push hook script, relative to the repo root.
21+
_GIT_PRE_PUSH_HOOK_PATH = '.git/hooks/pre-push'
22+
23+
24+
def get_uncommitted_changed_added_files() -> set[str]:
25+
"""Gets the list of changed or added files that are not committed yet.
26+
27+
This includes untracked files (i.e. newly created files not added to git
28+
yet).
29+
"""
30+
31+
# Each line is a filepath prefixed by its status.
32+
lines = os.popen('git status --porcelain').read().strip().split('\n')
33+
files = set()
34+
for line in lines:
35+
# Find changed (M), added (A), or untracked (??) files.
36+
if re.match(r'^\s*(M|A|\?\?)\s', line):
37+
# The last field of the line is the filepath.
38+
files.add(line.split()[-1])
39+
return files
40+
41+
42+
def get_committed_changed_added_files() -> set[str]:
43+
"""Gets the list of changed or added files that are committed locally.
44+
45+
These are the files that are committed in the local branch but not jyet
46+
merged to the origin/master branch.
47+
"""
48+
49+
# Each line is a filepath.
50+
lines = os.popen(
51+
# --name-only : include filepaths only in the output.
52+
# --diff-filter=AM : include only added (A) or modified (M) files.
53+
# --no-renames : if a file is renamed, treat it as a deleted file and
54+
# an added file, so that the added file will be included in the
55+
# result.
56+
# origin/master...HEAD : compare the local branch HEAD with
57+
# origin/master, showing changes that exist on your local branch
58+
# but not on origin/master.
59+
'git diff --name-only --diff-filter=AM --no-renames origin/master...HEAD'
60+
).read().strip().split('\n')
61+
return set(lines)
62+
63+
64+
def get_cplusplus_files(files: set[str]) -> set[str]:
65+
"""Filters the given list of files and returns the C++ files."""
66+
67+
return {
68+
f for f in files
69+
if os.path.splitext(f)[1] in ('.cc', '.h', '.cpp', '.cxx')
70+
}
71+
72+
73+
def get_python_files(files: set[str]) -> set[str]:
74+
"""Filters the given list of files and returns the Python files."""
75+
76+
return {f for f in files if os.path.splitext(f)[1] == '.py'}
77+
78+
79+
def tool_is_installed(file_type: str, tool: str) -> bool:
80+
"""Checks if the given tool is installed.
81+
82+
Args:
83+
file_type: The type of the file (e.g. C++, python).
84+
tool: The name of the tool.
85+
86+
Returns:
87+
True if the tool is already installed.
88+
"""
89+
90+
# Is the tool already installed?
91+
if os.system(f'which {tool} > /dev/null') == 0:
92+
return True
93+
94+
print(
95+
f'WARNING: {tool} is not installed. It\'s needed for formatting '
96+
f'{file_type} files. Please install it and retry.',
97+
file=sys.stderr)
98+
return False
99+
100+
101+
def format_files(file_type: str, tool: str, format_command: str,
102+
files: set[str]) -> bool:
103+
"""Fixes the formatting of the given files.
104+
105+
Args:
106+
file_type: The type of the file (e.g. C++, python).
107+
tool: The name of the tool.
108+
format_command: The command to format the files.
109+
files: The files to format.
110+
Returns:
111+
True if the formatting was successful.
112+
"""
113+
114+
if not files:
115+
return True
116+
117+
if not tool_is_installed(file_type, tool):
118+
return False
119+
120+
command = f'{format_command} {" ".join(files)}'
121+
if os.system(command) == 0:
122+
print(
123+
f'Successfully formatted {file_type} files:\n' +
124+
'\n'.join(sorted(' ' + f for f in files)),
125+
file=sys.stderr)
126+
return True
127+
128+
print(
129+
f'WARNING: Failed to format {file_type} files via command: {command}',
130+
file=sys.stderr)
131+
return False
132+
133+
134+
def format_cplusplus_files(files: set[str]) -> bool:
135+
"""Fixes the formatting of the given C++ files.
136+
137+
Returns:
138+
True if the formatting was successful.
139+
"""
140+
141+
return format_files('C++', _CLANG_FORMAT, f'{_CLANG_FORMAT} -i -style=file',
142+
files)
143+
144+
145+
def format_python_files(files: set[str]) -> bool:
146+
"""Fixes the formatting of the given Python files.
147+
148+
Returns:
149+
True if the formatting was successful.
150+
"""
151+
152+
return format_files('python', _YAPF, f'{_YAPF} -i', files)
153+
154+
155+
def set_git_push_hook() -> bool:
156+
"""Sets up `git push` to automatially run this script before pushing.
157+
158+
Returns:
159+
True if the set-up was successful.
160+
"""
161+
162+
if os.path.isfile(_GIT_PRE_PUSH_HOOK_PATH):
163+
print(
164+
f'git pre-push hook already exists at {_GIT_PRE_PUSH_HOOK_PATH}.',
165+
file=sys.stderr)
166+
# Ask the user if they want to overwrite it.
167+
overwrite = input(
168+
f'Are you sure you want to overwrite {_GIT_PRE_PUSH_HOOK_PATH}? [y/N] ')
169+
if overwrite.lower() != 'y':
170+
print('Skipping git pre-push hook setup.', file=sys.stderr)
171+
return False
172+
173+
# Create a git pre-push hook to run this script.
174+
with open(_GIT_PRE_PUSH_HOOK_PATH, 'w') as f:
175+
f.write(f'''#!/bin/bash
176+
# This hook is automatically set by `{_SCRIPT_PATH } --set_git_push_hook`.)
177+
# We ignore any errors, as file formatting is best effort.
178+
{_SCRIPT_PATH} || true
179+
''')
180+
181+
# Make the hook executable.
182+
os.system(f'chmod +x {_GIT_PRE_PUSH_HOOK_PATH}')
183+
return True
184+
185+
186+
def main() -> None:
187+
arg_parser = argparse.ArgumentParser(prog=_SCRIPT_PATH, description=__doc__)
188+
arg_parser.add_argument(
189+
'--set_git_push_hook',
190+
action='store_true',
191+
help='set up `git push` to automatically run this script before pushing')
192+
args = arg_parser.parse_args()
193+
194+
os.chdir(_PTXLA_DIR)
195+
if args.set_git_push_hook:
196+
sys.exit(0 if set_git_push_hook() else 1)
197+
198+
files = (
199+
get_uncommitted_changed_added_files() |
200+
get_committed_changed_added_files())
201+
202+
# We don't use `format_cplusplus_files(...) and format_python_files(...)` as
203+
# we don't want shortcircuiting.
204+
success = format_cplusplus_files(get_cplusplus_files(files))
205+
if not format_python_files(get_python_files(files)):
206+
success = False
207+
208+
if not success:
209+
sys.exit(1)
210+
211+
212+
if __name__ == '__main__':
213+
main()

0 commit comments

Comments
 (0)