Activity
Fix: Prioritize default head_dim when provided by architecture (Gemma…
Fix: Correctly read query_pre_attn_scalar from text_config (Gemma3)
Merge remote-tracking branch 'origin/dev' into dev
Update chat.py, include multi-line input support and context clearing…
Pull request merge
Support partial_rotary_factor (Phi-4 mini)
Fix alt pos embeddings and block diagonal mask when flash-attn is dis…
Test script: Allow --eval_rows in wiki2 ppl test
Fix compilation errors on aarch64
Don't compile AVX2 functions when building without AVX2 support