Prerequisites
Feature Description
Add support for Rotorquant
Motivation
A fork is already working on this, but it provides no releases. It would be great to merge that work so there are even more alternatives for cache compression.
Possible Implementation
https://github.com/johndpope/llama-cpp-turboquant