# Curated examples from issues

Lots of people have filed issues against git-filter-repo, and many of
them boil down to questions of "How do I?" or "Why doesn't this work?"

I thought I'd collect a bunch of these as example repository filterings
that others may be interested in.

## Table of Contents

  * [Adding files to root commits](#adding-files-to-root-commits)
  * [Purge a large list of files](#purge-a-large-list-of-files)

## Adding files to root commits

<!-- https://github.com/newren/git-filter-repo/issues/21 -->

Here's an example that will take `/path/to/existing/README.md` and
store it as `README.md` in the repository, and take
`/home/myusers/mymodule.gitignore` and store it as `src/.gitignore` in
the repository:

```
git filter-repo --commit-callback "if not commit.parents: commit.file_changes += [
    FileChange(b'M', b'README.md', b'$(git hash-object -w '/path/to/existing/README.md')', b'100644'),
    FileChange(b'M', b'src/.gitignore', b'$(git hash-object -w '/home/myusers/mymodule.gitignore')', b'100644')]"
```

Alternatively, you could also use the [insert-beginning contrib script](../contrib/filter-repo-demos/insert-beginning).

## Purge a large list of files

<!-- https://github.com/newren/git-filter-repo/issues/63 -->

List all the filenames in some file (one per line),
e.g. `../DELETED_FILENAMES.txt`, and then run

```
git filter-repo --invert-paths --paths-from-file ../DELETED_FILENAMES.txt
```

## Extracting a library to a separate repo

<!-- https://github.com/newren/git-filter-repo/issues/80 -->

```
git filter-repo \
    --path src/some-folder/some-feature \
    --path-rename src/some-folder/some-feature/:src/
```

## Replace words in all commit messages

<!-- https://github.com/newren/git-filter-repo/issues/83 -->

```
git filter-repo --message-callback 'return message.replace(b"stuff", b"task")'
```

## Only keep files from two branches

<!-- https://github.com/newren/git-filter-repo/issues/91 -->

Let's say you know that the files currently present on two branches
are the only files that matter. Files that used to exist on either of
these branches but no longer do, and files that only exist on some
other branch, should all be deleted from all versions of history. This
can be accomplished by getting a list of filenames from each branch,
combining them, sorting the list and picking out just the unique
entries, then passing the result to `--paths-from-file`:

```
git ls-tree -r --name-only ${BRANCH1} >../my-files
git ls-tree -r --name-only ${BRANCH2} >>../my-files
sort ../my-files | uniq >../my-relevant-files
git filter-repo --paths-from-file ../my-relevant-files
```

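The combine, sort, and de-duplicate step can also be sketched in Python. This is only an illustration: the function name and the sample listings are made up, and the listings stand in for text containing one path per line.

```python
def relevant_paths(*listings):
    """Merge newline-separated path listings into a sorted, de-duplicated list."""
    unique = set()
    for listing in listings:
        # Skip empty lines so a trailing newline does not produce an empty path.
        unique.update(line for line in listing.splitlines() if line)
    return sorted(unique)


# Hypothetical sample listings, one path per line.
branch1 = "README.md\nsrc/main.c\n"
branch2 = "src/main.c\ndocs/guide.md\n"

print("\n".join(relevant_paths(branch1, branch2)))
```

The printed lines are what would go into the file handed to `--paths-from-file`. Python sorts by code point, which may order entries differently from `sort` under some locales; the order does not matter for filtering.
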
## Renormalize end-of-line characters and add a .gitattributes

<!-- https://github.com/newren/git-filter-repo/issues/122 -->

```
contrib/filter-repo-demos/lint-history dos2unix
[edit .gitattributes]
contrib/filter-repo-demos/insert-beginning .gitattributes
```

## Remove spaces at the end of lines

<!-- https://github.com/newren/git-filter-repo/issues/145 -->

Remove all trailing spaces and tabs from the lines of non-binary
files, and strip trailing carriage returns as well:

```
git filter-repo --replace-text <(echo 'regex:[\r\t ]+(\n|$)==>\n')
```

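To see what that pattern does before rewriting history, it can be tried on a sample with Python's `re` module. This is a sketch under the assumption that the rule is applied to file contents as a plain bytes regex without extra flags; the helper name and the sample data are illustrative.

```python
import re

# Same pattern as in the --replace-text rule, compiled for bytes.
TRAILING_WS = re.compile(rb"[\r\t ]+(\n|$)")


def strip_trailing_whitespace(data: bytes) -> bytes:
    # The replacement template \n expands to a newline character.
    return TRAILING_WS.sub(rb"\n", data)


sample = b"one  \r\ntwo\t\nthree\n"
print(strip_trailing_whitespace(sample))
```

One consequence of the `$` alternative: a file whose last line ends in trailing whitespace but has no final newline gains a newline.
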
## Having both exclude and include rules for filenames

<!-- https://github.com/newren/git-filter-repo/issues/230 -->

If you want to have rules to both include and exclude filenames, you
can simply invoke `git filter-repo` multiple times. Alternatively,
you can dispense with `--path` arguments and instead use the more
generic `--filename-callback`. For example, to include all files under
`src/` except for `src/README.md`:

```
git filter-repo --filename-callback '
  if filename == b"src/README.md":
    return None
  if filename.startswith(b"src/"):
    return filename
  return None'
```

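Since the callback body is ordinary Python, the rule can be tried out as a standalone function before running the filter. The function name and the sample filenames below are hypothetical; the logic mirrors the callback above.

```python
def keep_src_except_readme(filename: bytes):
    """Return the filename to keep it, or None to drop it."""
    if filename == b"src/README.md":
        return None
    if filename.startswith(b"src/"):
        return filename
    return None


# Hypothetical filenames to check the rule against.
for name in [b"src/main.py", b"src/README.md", b"docs/index.md"]:
    print(name, "->", keep_src_except_readme(name))
```

Note that filenames are bytestrings, so the comparisons use `b"..."` literals.
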
## Removing paths with a certain extension

<!-- https://github.com/newren/git-filter-repo/issues/274 -->

```
git filter-repo --invert-paths --path-glob '*.xsa'
```

or

```
git filter-repo --filename-callback '
  if filename.endswith(b".xsa"):
    return None
  return filename'
```

## Removing a directory

<!-- https://github.com/newren/git-filter-repo/issues/278 -->

```
git filter-repo --path node_modules/electron/dist/ --invert-paths
```

## Convert from NFD filenames to NFC

<!-- https://github.com/newren/git-filter-repo/issues/296 -->

Because macOS normalizes UTF-8 filenames, and has historically
switched which kind of normalization it does, users may have committed
files with alternative normalizations to their repository. To convert
filenames in NFD form to NFC, run

```
git filter-repo --filename-callback '
  try:
    return subprocess.check_output("iconv -f utf-8-mac -t utf-8".split(),
                                   input=filename)
  except Exception:
    return filename
  '
```

or

```
git filter-repo --filename-callback '
  import unicodedata
  try:
    return unicodedata.normalize("NFC", filename.decode("utf-8")).encode("utf-8")
  except UnicodeDecodeError:
    return filename
  '
```

Because the callback is wrapped in single quotes for the shell, string
literals inside it need to use double quotes.

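The `unicodedata` variant can be checked outside of a filter run. The following is a small sketch with a made-up helper name and sample filename; it shows a decomposed "e" plus combining acute accent being composed into a single character, and a filename that is not valid UTF-8 being left alone.

```python
import unicodedata


def to_nfc(filename: bytes) -> bytes:
    """Return the NFC-normalized filename, or the input if it is not UTF-8."""
    try:
        return unicodedata.normalize("NFC", filename.decode("utf-8")).encode("utf-8")
    except UnicodeDecodeError:
        return filename


# Hypothetical NFD filename: "e" followed by U+0301 (combining acute accent).
nfd_name = "cafe\u0301.txt".encode("utf-8")
nfc_name = "caf\u00e9.txt".encode("utf-8")

print(to_nfc(nfd_name) == nfc_name)  # prints True
```
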
## Set the committer of the last few commits to myself

<!-- https://github.com/newren/git-filter-repo/issues/379 -->

```
git filter-repo --refs main~5..main --commit-callback '
  commit.committer_name = b"My Wonderful Self"
  commit.committer_email = b"[email protected]"
'
```

## Handling special characters, e.g. accents in names

<!-- https://github.com/newren/git-filter-repo/issues/383 -->

Characters like ë and á are multi-byte characters, and Python won't
allow you to place them directly in a bytestring literal
(e.g. `b"Raphaël González"` results in a `SyntaxError: bytes can
only contain ASCII literal characters` error from Python). To handle
them, make a normal string and then convert it to a bytestring. For
example, to change the author name and email where the author email is
currently `[email protected]`:

```
git filter-repo --refs main~5..main --commit-callback '
  if commit.author_email == b"[email protected]":
    commit.author_name = "Raphaël González".encode()
    commit.author_email = b"[email protected]"
'
```

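Both halves of that explanation can be confirmed in plain Python. In this sketch the `compiles` helper is made up for illustration, and the accented characters are written as escapes so the example does not depend on the file's encoding.

```python
def compiles(source: str) -> bool:
    """Report whether a Python expression compiles without a SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


# str -> bytes; encode() defaults to UTF-8.
author_name = "Rapha\u00ebl Gonz\u00e1lez".encode()
print(author_name)

# A bytes literal with a non-ASCII character is rejected by the parser,
# while encoding a normal string is fine.
print(compiles('b"Rapha\u00ebl"'))
print(compiles('"Rapha\u00ebl".encode()'))
```
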
## Handling repository corruption

<!-- https://github.com/newren/git-filter-repo/issues/420 -->

First, run fsck to get a list of the corrupt objects, e.g.:
```
$ git fsck
error in commit 166f57b3fbe31257100361ecaf735f305b533b21: missingSpaceBeforeDate: invalid author/committer line - missing space before date
Checking object directories: 100% (256/256), done.
```

Then print out that object literally to a temporary file:
```
$ git cat-file -p 166f57b3fbe31257100361ecaf735f305b533b21 >tmp
```

Taking a look at the file would show, for example:
```
$ cat tmp
tree e1d871155fce791680ec899fe7869067f2b4ffd2
author My Name <[email protected]>1673287380 -0800
committer My Name <[email protected]> 1673287380 -0800

Initial
```

Edit that file to fix the error (in this case, the missing space
between author email and author date):

```
tree e1d871155fce791680ec899fe7869067f2b4ffd2
author My Name <[email protected]> 1673287380 -0800
committer My Name <[email protected]> 1673287380 -0800

Initial
```

Save the updated file, then use `git replace` to make a replace reference
for it.
```
$ git replace -f 166f57b3fbe31257100361ecaf735f305b533b21 $(git hash-object -t commit -w tmp)
```

Then remove the temporary file `tmp` and run `filter-repo` to consume
the replace reference and make it permanent:

```
$ rm tmp
$ git filter-repo --proceed
```

Note that if you have multiple corrupt objects, you only need to run
filter-repo once; just wait to do that step until you have all the
replacements in place.

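With many corrupt commits, the manual edit of `tmp` could be scripted. The following is only a sketch for the specific `missingSpaceBeforeDate` error shown above: the function name, the regular expression, and the sample commit (including its email address) are illustrative assumptions, and other kinds of corruption need other fixes.

```python
import re

# An author/committer line whose date follows the closing ">" with no space.
_MISSING_SPACE = re.compile(
    rb"^((?:author|committer) [^\n]*>)(\d+ [+-]\d{4})$", re.MULTILINE
)


def fix_missing_space_before_date(raw_commit: bytes) -> bytes:
    """Insert the missing space in the header lines, leaving the message alone."""
    headers, sep, message = raw_commit.partition(b"\n\n")
    return _MISSING_SPACE.sub(rb"\1 \2", headers) + sep + message


# Hypothetical raw commit, in the form printed by `git cat-file -p`.
raw = (
    b"tree e1d871155fce791680ec899fe7869067f2b4ffd2\n"
    b"author My Name <me@example.com>1673287380 -0800\n"
    b"committer My Name <me@example.com> 1673287380 -0800\n"
    b"\n"
    b"Initial\n"
)

print(fix_missing_space_before_date(raw).decode())
```

The result would still need to be written to a file and passed through `git hash-object -t commit -w` and `git replace -f` as in the steps above.
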
## Removing all files with a backslash in them

<!-- https://github.com/newren/git-filter-repo/issues/427 -->

```
git filter-repo --filename-callback 'return None if b"\\" in filename else filename'
```

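The Python side of that callback relies on a bytes literal containing an escaped backslash, which is a single byte. A minimal sketch, with a made-up function name and sample filenames, shows the check in isolation:

```python
# The two-character escape denotes one backslash byte.
BACKSLASH = b"\\"


def drop_if_backslash(filename: bytes):
    """Return None to drop filenames containing a backslash, else keep them."""
    return None if BACKSLASH in filename else filename


print(len(BACKSLASH))
print(drop_if_backslash(b"dir\\file.txt"))
print(drop_if_backslash(b"dir/file.txt"))
```
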
## Replace a binary blob in history

Either

```
git filter-repo --blob-callback '
  if blob.original_id == b"<hash of the bad object>":
    blob.data = open("<path to the replacement file>", "rb").read()
  '
```

or

```
```

<!-- https://github.com/newren/git-filter-repo/issues/436 -->
replace a binary blob in history

<!-- https://github.com/newren/git-filter-repo/pull/542 -->
callback for lint-history

<!-- https://github.com/newren/git-filter-repo/issues/300 -->
using replace refs to delete old history

<!-- https://github.com/newren/git-filter-repo/issues/492 -->
replacing pngs with compressed alternative
(#537 also used a change.blob_id thingy)

<!-- https://github.com/newren/git-filter-repo/issues/490 -->
<!-- https://github.com/newren/git-filter-repo/issues/504 -->
need for a multi-step filtering to avoid path collisions or ordering issues

<!-- https://lore.kernel.org/git/CABPp-BFqbiS8xsbLouNB41QTc5p0hEOy-EoV0Sjnp=xJEShkTw@mail.gmail.com/ -->
Two things:
  textwrap.dedent
  easier example of using git-filter-repo as a library