Figure 1: Liger Framework
```bash
git clone --recurse-submodules https://github.com/OpenSparseLLMs/Linearization.git
conda create -n liger python=3.10
conda activate liger
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
cd third_party/flash-linear-attention
pip install -e .
```
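After the editable install, it is worth a quick sanity check that the kernels are importable. This is a minimal check, assuming the import names are `flash_attn` (for flash-attn) and `fla` (for flash-linear-attention); adjust if your versions differ:

```python
# Sanity check that the attention kernels installed correctly.
# Assumes the import names `flash_attn` and `fla`.
import torch
import flash_attn
import fla

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("flash-attn:", flash_attn.__version__)
print("flash-linear-attention:", getattr(fla, "__version__", "installed"))
```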
- Copy your pre-trained base model directory (e.g. Meta-Llama-3-8B) to `./checkpoints/`;
- Replace the `config` file of the original Llama-3 base model with the `config` file of the Liger model (see `./checkpoints/liger_gla_base/config.json`); a sketch of these preparation steps follows this list;
- Modify the linearization settings in the `./configs/config.yaml` file (e.g. liger_gla.yaml);
- Run the linearization script:

```bash
sh scripts/train_liger.sh
```
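The preparation steps above boil down to copying the base model and swapping in the Liger reference config. Below is a minimal sketch of those steps in Python; the `Meta-Llama-3-8B` directory name and the source path are illustrative, so substitute your own:

```python
# Illustrative sketch of the preparation steps above; all paths are examples.
import shutil
from pathlib import Path

checkpoints = Path("./checkpoints")
base_model = checkpoints / "Meta-Llama-3-8B"                   # your copied base model
liger_config = checkpoints / "liger_gla_base" / "config.json"  # Liger reference config

# 1. Copy the pre-trained base model into ./checkpoints/ (skip if already done).
if not base_model.exists():
    shutil.copytree("/path/to/Meta-Llama-3-8B", base_model)

# 2. Replace the original config.json with the Liger model's config.
shutil.copy(liger_config, base_model / "config.json")

# 3. Adjust the linearization settings in ./configs (e.g. liger_gla.yaml),
#    then launch training with: sh scripts/train_liger.sh
```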
You need to install lm-evaluation-harness for evaluation:
```bash
cd third_party/lm-evaluation-harness
pip install -e .
```
```bash
python -m eval.harness --model hf \
    --model_args pretrained=/your/Liger/checkpoints/liger_base_model,peft=/your/Liger/checkpoints/lora_adapter_path \
    --tasks piqa,arc_easy,arc_challenge,hellaswag,winogrande \
    --batch_size 64 \
    --device cuda \
    --seed 0
```
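Beyond the harness, you can also load the linearized base model together with its LoRA adapter for a quick generation check. The sketch below uses `transformers` and `peft` with the same placeholder paths as the command above; depending on how the Liger architecture is registered, you may also need `trust_remote_code=True`:

```python
# Quick generation check; checkpoint and adapter paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "/your/Liger/checkpoints/liger_base_model"      # linearized base model
adapter_path = "/your/Liger/checkpoints/lora_adapter_path"  # LoRA adapter from training

tokenizer = AutoTokenizer.from_pretrained(base_path)
model = AutoModelForCausalLM.from_pretrained(
    base_path, torch_dtype=torch.bfloat16, device_map="cuda"
)
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

inputs = tokenizer("Gated recurrent structures are", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```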
We use the Triton-implemented linear attention kernels from fla-org/flash-linear-attention, and we refer to HazyResearch/lolcats in constructing our linearization training process. The evaluation is supported by lm-evaluation-harness. We sincerely thank them for their contributions!
If you find this repo useful, please cite and star our work:
```bib
@article{lan2025liger,
  title={Liger: Linearizing Large Language Models to Gated Recurrent Structures},
  author={Lan, Disen and Sun, Weigao and Hu, Jiaxi and Du, Jusen and Cheng, Yu},
  journal={arXiv preprint arXiv:2503.01496},
  year={2025}
}
```