Feature request

Currently, when using transformers.pipeline to load a PEFT fine-tuned model (e.g., "ybelkada/opt-350m-lora"), the pipeline loads only the base model without applying the LoRA adapters. This behavior is misleading: users expect to get the fine-tuned version of the model, not just the base model.
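For example, a minimal reproduction sketch (the original snippet is not reproduced here; the text-generation task, prompt, and generation arguments are assumptions):

```python
from transformers import pipeline

# According to this report, the pipeline resolves "ybelkada/opt-350m-lora"
# to the base OPT-350m model and silently ignores the LoRA adapter weights
# stored in the repository.
pipe = pipeline("text-generation", model="ybelkada/opt-350m-lora")
print(pipe("Hello, my name is", max_new_tokens=20))
```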
This code loads the base OPT model instead of the LoRA fine-tuned opt-350m-lora model.
I propose one of the following improvements:
1. Add a warning message when loading a PEFT fine-tuned model without applying LoRA adapters, for example:
   "Warning: You are loading a LoRA fine-tuned model, but the LoRA adapters have not been applied. Use PeftModel.from_pretrained() to correctly load the fine-tuned version."
   (A sketch of this workaround is shown after this list.)
2. Automatically detect and apply LoRA adapters when using pipeline, similar to how AutoModelForCausalLM.from_pretrained() works with PEFT models.
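For reference, here is a sketch of the workaround the proposed warning message points to. It assumes a standard LoRA adapter; merge_and_unload() is used so that pipeline receives a plain transformers model, and the prompt and generation arguments are arbitrary:

```python
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

adapter_id = "ybelkada/opt-350m-lora"

# adapter_config.json records which base model the adapter was trained on.
peft_config = PeftConfig.from_pretrained(adapter_id)
base_id = peft_config.base_model_name_or_path

base_model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Attach the LoRA adapters and merge them into the base weights, so the
# result is an ordinary transformers model that pipeline() accepts.
model = PeftModel.from_pretrained(base_model, adapter_id)
model = model.merge_and_unload()

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Hello, my name is", max_new_tokens=20))
```

The same PeftConfig lookup of base_model_name_or_path could serve as the hook for pipeline to detect and apply adapters automatically (option 2 above).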
Motivation
This behavior creates the false impression that users are working with the fine-tuned model when, in reality, they are only using the base model. Many users may not realize this and will get incorrect results without knowing why. A clear warning, or automatic application of the LoRA adapters, would significantly improve the user experience and reduce confusion.
Your contribution
I can help test the implementation if needed. Let me know if there are any specific areas where contributions would be useful.