
Commit 0d57110

Update max_length explanation for VLM in online trainers (#4220)

Co-authored-by: Quentin Gallouédec <[email protected]>
Parent: 4995b24

2 files changed: +16 −4 lines

docs/source/grpo_trainer.md (8 additions, 2 deletions)

````diff
@@ -563,8 +563,14 @@ accelerate launch \
 
 ### Configuration Tips
 
-> [!WARNING]
-> VLM training may fail if image tokens are truncated. We highly recommend disabling truncation by setting `max_prompt_length` to `None`.
+> [!TIP]
+> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_prompt_length=None` in the [`GRPOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
+>
+> ```python
+> GRPOConfig(max_prompt_length=None, ...)
+> ```
+>
+> Only use `max_prompt_length` when you've verified that truncation won't remove image tokens for the entire dataset.
 
 - Use LoRA on vision-language projection layers
 - Enable 4-bit quantization to reduce memory usage
````
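The failure mode the new tip describes can be shown with a small, self-contained sketch. This is a hypothetical illustration, not TRL code: it assumes a string placeholder token and that prompts are truncated from the left (keeping the most recent tokens), which is why a prompt-length cap can silently drop image placeholders and leave fewer placeholders than images.

```python
# Hypothetical illustration (not TRL code) of why truncating a VLM prompt
# breaks training: left-truncation can drop image placeholder tokens, so
# the number of placeholders no longer matches the pixel inputs.

IMAGE_TOKEN = "<image>"  # assumed placeholder; real VLMs use model-specific tokens

prompt = [IMAGE_TOKEN, IMAGE_TOKEN, "Compare", "the", "two", "pictures", "."]

def image_token_count(tokens):
    # Count how many image placeholders survive in a token sequence.
    return sum(t == IMAGE_TOKEN for t in tokens)

full = prompt            # max_prompt_length=None: nothing is removed
truncated = prompt[-5:]  # e.g. a cap of 5 tokens with left-truncation

print(image_token_count(full))       # 2
print(image_token_count(truncated))  # 0: both image tokens were cut off
```

With both placeholders gone, the model would receive two images but zero image slots in the text, which is exactly the mismatch the docs warn about.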

docs/source/rloo_trainer.md (8 additions, 2 deletions)

````diff
@@ -545,8 +545,14 @@ accelerate launch \
 
 ### Configuration Tips
 
-> [!WARNING]
-> VLM training may fail if image tokens are truncated. We highly recommend disabling truncation by setting `max_prompt_length` to `None`.
+> [!TIP]
+> For VLMs, truncating may remove image tokens, leading to errors during training. To avoid this, set `max_prompt_length=None` in the [`RLOOConfig`]. This allows the model to process the full sequence length without truncating image tokens.
+>
+> ```python
+> RLOOConfig(max_prompt_length=None, ...)
+> ```
+>
+> Only use `max_prompt_length` when you've verified that truncation won't remove image tokens for the entire dataset.
 
 - Use LoRA on vision-language projection layers
 - Enable 4-bit quantization to reduce memory usage
````
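The docs say to use `max_prompt_length` only after verifying that truncation won't remove image tokens anywhere in the dataset. A minimal sketch of such a check, with an assumed placeholder token id and the assumption that prompts are left-truncated to their last `max_prompt_length` tokens (the helper name and id are hypothetical, not part of TRL):

```python
# Hypothetical pre-flight check (not part of TRL): confirm that capping the
# prompt length would keep every image placeholder token in every prompt.

IMAGE_TOKEN_ID = 32000  # assumed placeholder id; model-specific in practice

def truncation_is_safe(tokenized_prompts, max_prompt_length):
    """Return True if keeping only the last `max_prompt_length` tokens of
    each prompt drops no image placeholder tokens."""
    for ids in tokenized_prompts:
        kept = ids[-max_prompt_length:]
        if kept.count(IMAGE_TOKEN_ID) != ids.count(IMAGE_TOKEN_ID):
            return False
    return True

prompts = [
    [32000, 32000, 5, 6, 7],  # image tokens at the front of the prompt
    [1, 2, 32000, 3, 4],
]
print(truncation_is_safe(prompts, 5))  # True: nothing is truncated
print(truncation_is_safe(prompts, 3))  # False: leading image tokens dropped
```

If the check fails for any prompt, leave `max_prompt_length=None` as the tip recommends.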
