Releases: netdur/llama_cpp_dart
v0.9.0-dev.8
v0.9.0-dev.8 — MTP speculative decoding, IQ4_NL KV cache
v0.9.0-dev.7
0.9.0-dev.7: b9360, embeddings, speculative decoding, broad API coverage
v0.9.0-dev.6
params: expose missing llama.cpp options across sampler/context/model

- SamplerParams: Mirostat (v1/v2), grammar (incl. lazy patterns), DRY, XTC, dynamic temperature, adaptive-P, top-n-sigma, infill, logit bias, shared min_keep. SamplerFactory.build now accepts model: for vocab-dependent stages.
- ContextParams: RoPE scaling/freq, full YaRN knobs, pooling and attention type, defrag threshold, no_perf, op_offload, swa_full, kv_unified.
- ModelParams: split mode, main GPU, tensor split, device list, GGUF kv overrides (int/float/bool/str), use_direct_io, use_extra_bufts, no_host, no_alloc.
- README: link to aichat sample app.
- Bump to 0.9.0-dev.6 + CHANGELOG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v0.9.0-dev.5
0.9.0-dev.5: probe fix follow-up to dev.4 (analyzer-clean)
v0.9.0-dev.2
0.9.0-dev.2: M10 gaps closed (KV quant, log capture, primary-accel, A…
v0.9.0-dev.1
v0.9.0-dev.1: M8.5 Hexagon AAR + CI permission fix

Adds:
- Android Hexagon NPU + OpenCL AAR (build via Snapdragon Docker toolchain). Six HTP DSP variants (v68 through v81) covering Snapdragon 865 -> 8 Elite. ~3.7 MB stripped.

Fixes:
- Workflow GITHUB_TOKEN didn't have contents:write so the v0.9.0-dev.0 build failed at the release-asset upload step. Added explicit permissions block.

NOT yet validated on a Snapdragon device: Hexagon AAR ships experimental until tested end-to-end on hardware.
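
For reference, an explicit permissions block of the kind described in the fix might look roughly like the excerpt below. This is a sketch only: the workflow name, trigger, and job layout are assumptions for illustration and are not taken from this repository. Only the `contents: write` grant is implied by the note.

```yaml
# Hypothetical workflow excerpt; name, trigger, and job are illustrative.
name: release
on:
  push:
    tags:
      - "v*"

# Grant the workflow's GITHUB_TOKEN write access to repository contents,
# which creating a release and uploading release assets requires.
permissions:
  contents: write

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```

Declaring `permissions` at the workflow level applies it to every job; it can also be set per job to keep the grant narrower.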