Use dynamic FLUX T5 seq len (small speed / mem improvement)#7400

Closed
RyanJDick wants to merge 2 commits into main from ryan/flux-dynamic-seq-len

@RyanJDick
Contributor

Summary

This PR adds the option to dynamically select a T5 sequence length based on the length of the input prompt.

Using a smaller sequence length can improve inference speed and reduce peak memory, but will produce slightly different outputs.
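
The selection rule itself is not shown in this description. The following is a minimal sketch, not the PR's actual implementation. It assumes the candidate lengths 128, 256, and 512 implied by the QA instructions, uses a hypothetical helper name `select_t5_seq_len`, and assumes that prompts longer than the largest candidate fall back to it (and would be truncated):

```python
# Assumed candidate lengths, inferred from the QA instructions in this PR.
T5_SEQ_LEN_CANDIDATES = (128, 256, 512)


def select_t5_seq_len(num_prompt_tokens: int, candidates=T5_SEQ_LEN_CANDIDATES) -> int:
    """Return the smallest candidate length that fits the tokenized prompt.

    `select_t5_seq_len` is a hypothetical name used for illustration only.
    """
    if num_prompt_tokens < 0:
        raise ValueError("num_prompt_tokens must be non-negative")
    for seq_len in sorted(candidates):
        if num_prompt_tokens <= seq_len:
            return seq_len
    # Assumption: prompts longer than the largest candidate use the largest one.
    return max(candidates)


if __name__ == "__main__":
    for n in (50, 127, 128, 129, 250, 500):
        print(n, "->", select_t5_seq_len(n))
```

Under these assumptions, a prompt of exactly 128 tokens still maps to 128, and 129 tokens is the first length that moves up to 256.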

Speed Improvement

Config: 1024x1024, bf16

T5 seq len = 512: 0.481 secs / it
T5 seq len = 128: 0.461 secs / it (4.1% speedup)

Memory Improvement

Config: 1024x1024, bf16

T5 seq len = 512: 23355.16 MB peak VRAM
T5 seq len = 128: 23292.12 MB peak VRAM (63 MB saved)
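
To put the saving in context, here is a small sketch that derives the absolute and relative difference from the two reported peaks. The variable names are illustrative, and the inputs are the figures listed above:

```python
# Peak VRAM figures reported above (MB).
peak_mb_seq_512 = 23355.16
peak_mb_seq_128 = 23292.12

saved_mb = peak_mb_seq_512 - peak_mb_seq_128
relative_saving = saved_mb / peak_mb_seq_512

print(f"Saved: {saved_mb:.2f} MB")                # Saved: 63.04 MB
print(f"Relative saving: {relative_saving:.2%}")  # Relative saving: 0.27%
```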

Image Difference

TODO: Add sample images showing how the output changes as the T5 sequence length is varied.

Related Issues / Discussions

QA Instructions

  • Test varied prompt lengths, in T5 tokens (expected T5 seq len in parentheses):
    • ~50 (T5 seq len of 128)
    • 127 (T5 seq len of 128)
    • 128 (T5 seq len of 128)
    • 129 (T5 seq len of 256)
    • ~250 (T5 seq len of 256)
    • ~500 (T5 seq len of 512)
  • Test with regional prompts of varied lengths
  • Test compatibility with:
    • ControlNet
    • IP-Adapter
    • LoRA

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions (bot) added labels on Nov 29, 2024: python (PRs that change python files), invocations (PRs that change invocations), backend (PRs that change backend files)