
Crash when running --blob-callback on blobs larger than ~600,000,000 bytes #616

Open
relgukxilef opened this issue Dec 5, 2024 · 1 comment

@relgukxilef

Hello, I'm trying to convert certain files in my repository from one format to another. I wrote some Python code to accomplish this and am passing it to git-filter-repo's --blob-callback argument.
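
For reference, the invocation has roughly this shape (convert_to_new_format is just a stand-in for my actual conversion code):

  git filter-repo --blob-callback '
    # blob.data is the raw file content as bytes; reassigning it
    # rewrites the blob in the filtered history
    blob.data = convert_to_new_format(blob.data)
  '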

This works for a few thousand commits, but then fast-import crashes with the message fatal: cannot truncate pack to skip duplicate: Invalid argument and writes a fast_import_crash file.

I've tried this multiple times: with different callbacks, with the repository filtered to different paths, and with filtering by path first and then applying my blob callback in a second run of git-filter-repo. The exact blob at which it stops differs between runs, but it is always a blob much larger than the others, above 600,000,000 bytes. Perhaps this is a known or intentional limitation of git fast-import or git-filter-repo.

  get-mark :655640
  blob
  mark :655641
  data 1507406
  blob
  mark :655642
  data 1558
  blob
  mark :655643
  data 865875
  blob
  mark :655644
* data 684724504

I have attached one such fast_import_crash file, but I have removed file and branch names, as this is a company repository.
fast_import_crash_30556.zip

@relgukxilef (Author)

I have tried running git-filter-repo with --blob-callback return, and this finishes without issues. I have also tried returning conditionally when either the input or the output of the conversion is larger than 1 MB (far less than the last blob listed in fast_import_crash), but it still crashes at a 600 MB blob, even though the callback shouldn't do anything with it. Perhaps, having already updated some blobs, it fails to handle the large blob even when just passing it along?
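
The conditional version looked roughly like this (again with convert_to_new_format standing in for the real conversion):

  git filter-repo --blob-callback '
    # skip the conversion entirely when either the original blob or the
    # converted result would be larger than ~1 MB
    if len(blob.data) > 1000000:
      return
    new_data = convert_to_new_format(blob.data)
    if len(new_data) > 1000000:
      return
    blob.data = new_data
  '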
