DeepSeek 671B Fine-Tuning: how to merge LoRA model #6222
Comments
We used the following code for the conversion:
Our machine has 2 TB of memory and 2 TB of swap space. We attempted to merge the LoRA weights, but the result reported missing weights: compared with the original model, the merged checkpoint is missing over 1000 layers of weights.
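One way to narrow down which tensors disappeared is to diff the parameter names between the original checkpoint and the merged one. A minimal sketch, assuming both checkpoints are saved as sharded safetensors with a `model.safetensors.index.json` index file (the directory paths below are hypothetical):

```python
import json
from pathlib import Path

def checkpoint_keys(ckpt_dir: str) -> set[str]:
    """Collect tensor names from a sharded safetensors checkpoint's index file."""
    index = json.loads(Path(ckpt_dir, "model.safetensors.index.json").read_text())
    return set(index["weight_map"].keys())

base_keys = checkpoint_keys("deepseek-671b-base")      # hypothetical path
merged_keys = checkpoint_keys("deepseek-671b-merged")  # hypothetical path

missing = sorted(base_keys - merged_keys)
print(f"{len(missing)} tensors present in the base model but absent after merging")
for name in missing[:20]:
    print(name)
```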
What's your fine-tuning environment? Thanks.
When merging LoRA weights on a machine with 8 GPUs and 1.7 TB of memory, I keep getting OOM (Out Of Memory) errors. What should I do?
Please have a look at this example: https://discuss.huggingface.co/t/help-with-merging-lora-weights-back-into-base-model/40968. We are working on providing a straightforward example on GitHub soon.
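For reference, a minimal sketch of the merge along the lines of that discussion, using PEFT's `merge_and_unload()`; the paths are hypothetical, and the base model is loaded on CPU in bf16, which sidesteps GPU OOM at the cost of needing enough host RAM to hold the full model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "deepseek-ai/DeepSeek-V3"   # hypothetical: your local base checkpoint
adapter_path = "output/lora-adapter"    # hypothetical: LoRA output dir from fine-tuning
merged_path = "output/merged-model"

# Load the base model entirely on CPU in bf16 so GPU memory is not involved;
# the merge itself just folds the LoRA deltas into the base linear layers.
base = AutoModelForCausalLM.from_pretrained(
    base_path,
    torch_dtype=torch.bfloat16,
    device_map={"": "cpu"},
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base, adapter_path)
merged = model.merge_and_unload()   # fold LoRA weights into the base model

merged.save_pretrained(merged_path, safe_serialization=True)
AutoTokenizer.from_pretrained(base_path, trust_remote_code=True).save_pretrained(merged_path)
```

Note that 671B parameters in bf16 are roughly 1.3 TB of weights before any overhead, so even a 1.7-2 TB host will be close to the limit and heavy swap use during the merge is expected.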
After DeepSeek 671B fine-tuning, how do I load the original model together with the LoRA adapter to test it?
help ~
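If the goal is just to test the fine-tuned behaviour, the adapter can be loaded on top of the original model without merging at all. A minimal sketch with Transformers and PEFT (the paths and prompt are placeholders; `device_map="auto"` assumes enough GPUs to shard the bf16 weights):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_path = "deepseek-ai/DeepSeek-V3"   # hypothetical base checkpoint
adapter_path = "output/lora-adapter"    # hypothetical LoRA adapter dir

tokenizer = AutoTokenizer.from_pretrained(base_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",          # shard the base weights across available GPUs
    trust_remote_code=True,
)
# Attach the LoRA adapter on top of the frozen base weights; no merge needed for a quick test.
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()

prompt = "Explain LoRA fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```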