Fix bug in prefill_chunk_size that ignores disable_compile flag #38067
This PR fixes a bug in the `prefill_chunking` function where the compilation check is not performed before calling `get_compiled_call()`.

When using `prefill_chunk_size > 0` in a `GenerationConfig`, the model's forward function is always compiled, even if `disable_compile=True` is specified. This happens because `prefill_chunking` calls `get_compiled_call()` directly, without checking whether compilation should occur. The fix adds that check, so models aren't compiled when `disable_compile=True` is set.

Also fixed a typo in the error message ("chunkink" -> "chunking").
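The pattern of the fix can be sketched as follows. This is a simplified, illustrative stand-in, not the actual transformers code: `FakeModel` and `prefill_chunking_forward` are hypothetical names, and `get_compiled_call()` / `disable_compile` mirror the identifiers mentioned above.

```python
# Illustrative sketch of the fix (simplified; not the real transformers source).
# The real get_compiled_call() wraps the model's forward with torch.compile;
# here we just record that compilation happened.

class FakeModel:
    def __init__(self):
        self.compiled = False

    def forward(self, x):
        return x * 2

    def get_compiled_call(self):
        # Stand-in for the real method that returns a compiled forward.
        self.compiled = True
        return self.forward


def prefill_chunking_forward(model, generation_config):
    """Pick the forward callable, honoring disable_compile (the fix)."""
    if getattr(generation_config, "disable_compile", False):
        # Fixed path: skip compilation entirely.
        return model.forward
    # Original (buggy) path: compilation always happened here.
    return model.get_compiled_call()
```

With `disable_compile=True`, the chunked-prefill loop now uses the plain forward and the model is never compiled; without it, behavior is unchanged.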
Before submitting
Yes, though this is my first PR here, so I hope it's ok.
As far as I understand it, there's no need to change the documentation
No, tested with a simple generation script using prefill_chunk_size=8 and disable_compile=True. Before the fix, torch.compile was being called despite disable_compile=True. After the fix, no compilation occurs as expected.
Who can review?
@gante