Add indexCache training support by faresobeid · Pull Request #2541 · PrimeIntellect-ai/prime-rl

faresobeid · 2026-05-18T13:38:29Z

Can be enabled with the following config:

[trainer.model]
impl = "custom"
use_index_cache = true
index_topk_freq = 4

[inference.vllm_extra.hf_overrides]
use_index_cache = true
index_topk_freq = 4

Note

Medium Risk
Touches core attention forward paths and introduces cross-layer state (cached indices), which could affect correctness/performance if misconfigured or if assumptions about layer scheduling break.

Overview
Enables DSA IndexCache in training by adding use_index_cache, index_topk_freq, and optional index_topk_pattern to the trainer ModelConfig and propagating them into the loaded HF model_config.
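As a rough illustration of the propagation step described above, the sketch below copies the three IndexCache fields onto an HF-style config object. Only the field names `use_index_cache`, `index_topk_freq`, and `index_topk_pattern` come from this PR; the `IndexCacheSettings` dataclass, the `apply_index_cache_settings` helper, the default values, and the assumption that the pattern is a string are all hypothetical and not taken from the prime-rl code.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Optional


@dataclass
class IndexCacheSettings:
    """Hypothetical stand-in for the three new trainer ModelConfig fields."""

    use_index_cache: bool = False  # assumed default
    index_topk_freq: int = 1  # assumed default
    index_topk_pattern: Optional[str] = None  # assumed type; optional in the PR


def apply_index_cache_settings(settings: IndexCacheSettings, hf_config):
    """Copy the IndexCache fields onto an HF-style config object (hypothetical helper)."""
    hf_config.use_index_cache = settings.use_index_cache
    hf_config.index_topk_freq = settings.index_topk_freq
    # Only set the optional pattern when the user actually provided one.
    if settings.index_topk_pattern is not None:
        hf_config.index_topk_pattern = settings.index_topk_pattern
    return hf_config


# SimpleNamespace stands in for the loaded HF model config in this sketch.
hf_config = apply_index_cache_settings(
    IndexCacheSettings(use_index_cache=True, index_topk_freq=4),
    SimpleNamespace(model_type="glm_moe_dsa"),
)
```

The values used here mirror the `[trainer.model]` example in the PR description.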

Updates the custom glm_moe_dsa implementation so decoder layers can reuse sparse attention top-k indices across layers: attention/decoder forwards now thread a cached_indices tensor through the stack, and a new per-layer skip policy (_index_cache_skip_topk) controls when indices are recomputed vs reused.
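The reuse mechanism can be pictured with a small pure-Python sketch: each layer either recomputes its top-k indices or passes along the ones it received. The actual rule inside `_index_cache_skip_topk` is not shown in this PR description, so the policy below (recompute on every `freq`-th layer, reuse otherwise) is an assumption, and `skip_topk`, `topk_indices`, `attention_layer`, and `run_stack` are hypothetical names that use plain lists in place of tensors.

```python
from typing import List, Optional, Tuple


def skip_topk(layer_idx: int, freq: int) -> bool:
    """Assumed policy: recompute when layer_idx % freq == 0, reuse otherwise."""
    return layer_idx % freq != 0


def topk_indices(scores: List[float], k: int) -> List[int]:
    """Return the positions of the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def attention_layer(
    layer_idx: int,
    scores: List[float],
    cached_indices: Optional[List[int]],
    freq: int,
    k: int,
) -> Tuple[List[int], bool]:
    """Return (indices to attend over, whether they were recomputed)."""
    if cached_indices is not None and skip_topk(layer_idx, freq):
        return cached_indices, False  # reuse indices from an earlier layer
    return topk_indices(scores, k), True  # run the indexer for this layer


def run_stack(per_layer_scores: List[List[float]], freq: int, k: int):
    """Thread cached_indices through the layers, recording which ones recompute."""
    cached: Optional[List[int]] = None
    recomputed_layers: List[int] = []
    for layer_idx, scores in enumerate(per_layer_scores):
        cached, did_recompute = attention_layer(layer_idx, scores, cached, freq, k)
        if did_recompute:
            recomputed_layers.append(layer_idx)
    return cached, recomputed_layers


# Eight toy layers with six token scores each.
toy_scores = [[float((layer + j) % 5) for j in range(6)] for layer in range(8)]
final_indices, recomputed_layers = run_stack(toy_scores, freq=4, k=2)
```

Under this assumed policy and `index_topk_freq = 4`, only layers 0 and 4 of the eight toy layers run the top-k selection; the remaining layers reuse the cached indices. This is the cross-layer state the risk note refers to.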

Reviewed by Cursor Bugbot for commit 563b77f. Bugbot is set up for automated code reviews on this repo.

Co-authored-by: faresobeid <faresobeid@users.noreply.github.com>

Signed-off-by: faresobeid <111092724+faresobeid@users.noreply.github.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


Add indexCache training support

b10257e

cursor Bot reviewed May 18, 2026

Comment thread src/prime_rl/trainer/models/glm_moe_dsa/sparse_mla_attention.py Outdated

cursoragent and others added 2 commits May 18, 2026 14:31

fix glm moe dsa index cache threading

f1e1a5e

Co-authored-by: faresobeid <faresobeid@users.noreply.github.com>

Delete tests/unit/train/models/test_glm_moe_dsa_index_cache.py

9fb996a

Signed-off-by: faresobeid <111092724+faresobeid@users.noreply.github.com>

cursor Bot reviewed May 18, 2026

Comment thread src/prime_rl/trainer/models/glm_moe_dsa/modeling_glm_moe_dsa.py Outdated

fix

563b77f

cursor Bot reviewed May 18, 2026

Comment thread src/prime_rl/trainer/models/glm_moe_dsa/sparse_mla_attention.py

faresobeid wants to merge 4 commits into main from indexcache-train


2 participants
