context : always use non-causal attention for encoder graphs #12447

ggerganov · 2025-03-18T09:16:06Z

ggml-ci

fairydreaming · 2025-03-18T09:43:30Z

@ggerganov That won't work: the flag will be back to true during the set_inputs() call, when the mask is created. I think you have to move it to llama_context::encode().
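
To illustrate the ordering issue raised in this comment, here is a minimal, self-contained sketch. It is not the llama.cpp implementation: every `toy_*` name is hypothetical, and the only assumption taken from the thread is that the mask-filling step reads the causal flag at the time it runs. Under that assumption, an override that is restored before the mask is filled has no effect on the mask, whereas an override that spans the whole encode call does.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the context parameters; only the flag matters here.
struct toy_cparams {
    bool causal_attn = true;
};

// Fill an n x n KQ mask: 1 = may attend, 0 = masked out.
// Assumption: like the set_inputs() step in the thread, it reads the flag when called.
static std::vector<int> toy_set_inputs(const toy_cparams & cparams, std::size_t n) {
    assert(n > 0);
    std::vector<int> mask(n * n, 1);
    if (cparams.causal_attn) {
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = i + 1; j < n; ++j) {
                mask[i * n + j] = 0; // token i cannot see the later token j
            }
        }
    }
    return mask;
}

// Override scoped to graph construction only: the flag is restored too early.
static std::vector<int> toy_encode_override_in_build(toy_cparams & cparams, std::size_t n) {
    const bool causal_attn_org = cparams.causal_attn;
    cparams.causal_attn = false;
    // ... the encoder graph would be built here ...
    cparams.causal_attn = causal_attn_org;
    return toy_set_inputs(cparams, n); // sees the original (causal) flag again
}

// Override scoped to the whole encode call: the mask is filled while it is active.
static std::vector<int> toy_encode_override_in_encode(toy_cparams & cparams, std::size_t n) {
    const bool causal_attn_org = cparams.causal_attn;
    cparams.causal_attn = false;
    // ... the encoder graph would be built here ...
    std::vector<int> mask = toy_set_inputs(cparams, n);
    cparams.causal_attn = causal_attn_org; // restore only after the inputs are set
    return mask;
}

// Number of (query, key) pairs that are allowed to attend.
static int toy_count_visible(const std::vector<int> & mask) {
    return std::accumulate(mask.begin(), mask.end(), 0);
}
```

For three tokens, the first variant still yields a lower-triangular (causal) mask, while the second yields a full mask and leaves the caller's flag unchanged afterwards.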

ggml-ci

fairydreaming

I confirm that it fixes the KQ mask problem for T5 encoder.

…g#12447) * context : always use non-causal attention for encoder graphs ggml-ci * context : move the change to llama_context::encode() ggml-ci

context : always use non-causal attention for encoder graphs

a0554c3

ggml-ci

ggerganov requested a review from fairydreaming March 18, 2025 09:16

CISC linked an issue Mar 18, 2025 that may be closed by this pull request

Eval bug: b4882 broke t5 #12435

Closed

ggerganov mentioned this pull request Mar 18, 2025

llama : refactor llama_context, llama_kv_cache, llm_build_context (v2) #12181

Merged

context : move the change to llama_context::encode()

29acf2c

ggml-ci

fairydreaming approved these changes Mar 18, 2025

fairydreaming mentioned this pull request Mar 18, 2025

Eval bug: b4882 broke t5 #12435

Closed

ggerganov merged commit 8551c44 into master Mar 18, 2025
54 checks passed
