fix(python): dispatch vindex infer through inference weights by citizenu03bb · Pull Request #174 · chrishayuk/larql

citizenu03bb · 2026-06-29T14:35:58Z

Title:
fix(python): dispatch vindex infer through inference weights

Summary:

Make Python Vindex lazy-load InferenceWeights instead of dense-only mmap weights for infer.
Route infer and infer_trace through InferenceWeights::infer_patched.
Keep dense-only analysis helpers using InferenceWeights::as_weights().
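
The lazy-load behaviour in the first item can be pictured with a minimal sketch. It only illustrates "load on first use, then reuse" and is not the PR's code: `LazyWeights`, `fake_loader`, and the dictionary returned by the loader are hypothetical stand-ins, not larql names.

```python
from typing import Callable, Optional


class LazyWeights:
    """Hypothetical stand-in: loads weights on first use, then reuses them."""

    def __init__(self, loader: Callable[[], dict]):
        self._loader = loader
        self._weights: Optional[dict] = None
        self.load_count = 0  # counts real loads, for illustration only

    def get(self) -> dict:
        # Run the (potentially expensive) loader only on the first call.
        if self._weights is None:
            self._weights = self._loader()
            self.load_count += 1
        return self._weights


def fake_loader() -> dict:
    # Assumption: a real loader would open and map the weight files here.
    return {"format": "q4k", "layers": 2}


lazy = LazyWeights(fake_loader)
first = lazy.get()   # triggers the load
second = lazy.get()  # reuses the cached object
```

Callers that never run inference never pay the loading cost, and repeated calls share one loaded object.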

Why:
The Python Vindex.infer path used the dense mmap loader directly. That loader
does not understand Q4K/kquant vindex weight files, so quantized indexes could
be loaded through the wrong path and produce incorrect predictions and traces.
The shared InferenceWeights loader already detects the format and dispatches to
dense or Q4K inference as appropriate.
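
As a rough illustration of the detect-then-dispatch idea described above, here is a minimal sketch. Everything in it is an assumption made for illustration: `WeightFormat`, `InferenceWeightsSketch`, `load_weights`, and the `"quant"` metadata key are hypothetical and do not reflect larql's actual types or index layout.

```python
from dataclasses import dataclass
from enum import Enum


class WeightFormat(Enum):
    DENSE = "dense"
    Q4K = "q4k"


@dataclass
class InferenceWeightsSketch:
    """Hypothetical wrapper that remembers which format was loaded."""

    fmt: WeightFormat

    def infer(self, prompt: str) -> str:
        # Dispatch on the detected format instead of assuming dense weights.
        if self.fmt is WeightFormat.Q4K:
            return f"q4k-path:{prompt}"
        return f"dense-path:{prompt}"


def load_weights(index_metadata: dict) -> InferenceWeightsSketch:
    # Assumption: the index metadata records its quantization under "quant".
    quant = index_metadata.get("quant", "none")
    fmt = WeightFormat.Q4K if quant == "q4k" else WeightFormat.DENSE
    return InferenceWeightsSketch(fmt)


dense_result = load_weights({}).infer("hi")
q4k_result = load_weights({"quant": "q4k"}).infer("hi")
```

The point of the design is that the caller asks one loader for weights and never chooses a code path itself, so a quantized index cannot be sent through the dense path by accident.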

Verification:

cargo check -p larql-python
cargo test -p larql-python --lib

Branch:
pr/python-kquant-infer-dispatch

Commit:
e4fe4fd

citizenu03bb wants to merge 1 commit into chrishayuk:main from citizenu03bb:pr/python-kquant-infer-dispatch
