
Confusing behavior when loading PEFT models with pipeline #36473

Open
XEric7 opened this issue Feb 28, 2025 · 3 comments · May be fixed by #36480

XEric7 commented Feb 28, 2025

Feature request

Currently, when using transformers.pipeline to load a PEFT fine-tuned model (e.g., "ybelkada/opt-350m-lora"), the pipeline loads only the base model without applying the LoRA adapters. This behavior is misleading because users would expect to get the fine-tuned version of the model rather than just the base model.

For example:

import transformers
pipeline = transformers.pipeline("text-generation", model="ybelkada/opt-350m-lora")

This code loads the base OPT model instead of the LoRA fine-tuned opt-350m-lora model.
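
For reference, a quick ad-hoc check (not an official API, it just relies on the fact that peft normally injects submodules named lora_A / lora_B when adapters are applied) makes the problem visible; an empty list means only the base model was loaded:

import transformers

pipe = transformers.pipeline("text-generation", model="ybelkada/opt-350m-lora")

# peft injects submodules named lora_A / lora_B when adapters are applied,
# so an empty list here means only the base model was loaded.
lora_modules = [name for name, _ in pipe.model.named_modules() if "lora" in name.lower()]
print(type(pipe.model).__name__)
print(lora_modules)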

I propose one of the following improvements:

  1. Add a warning message when loading a PEFT fine-tuned model without applying the LoRA adapters (see the workaround sketch after this list). For example:

    "Warning: You are loading a LoRA fine-tuned model, but the LoRA adapters have not been applied. Use PeftModel.from_pretrained() to correctly load the fine-tuned version."

  2. Automatically detect and apply LoRA adapters when using pipeline, similar to how AutoModelForCausalLM.from_pretrained() works with PEFT models.
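
Until one of these is implemented, a possible workaround (a minimal sketch, assuming the peft package is installed and that facebook/opt-350m is the base checkpoint for the adapter above) is to attach the adapters explicitly and hand the resulting model to pipeline:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the base model explicitly, then attach the LoRA adapters with peft.
base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
peft_model = PeftModel.from_pretrained(base, "ybelkada/opt-350m-lora")

# Merging the adapters into the base weights gives back a plain transformers model,
# so the pipeline needs no PEFT-specific handling.
model = peft_model.merge_and_unload()
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Hello, my name is", max_new_tokens=20))

The merge step is just one option; it keeps the example simple by avoiding any PEFT-aware code paths inside the pipeline.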

Motivation

This behavior is problematic because it leads users to believe they are working with the fine-tuned model when, in reality, they are only using the base model. Many users might not realize this and get incorrect results without knowing why. A clear warning, or automatic application of the LoRA adapters, would significantly improve the user experience and reduce confusion.

Your contribution

I can help test the implementation if needed. Let me know if there are any specific areas where contributions would be useful.

XEric7 added the Feature request label Feb 28, 2025
Rocketknight1 (Member) commented

Ah, that's a surprising bug! It must be specific to the pipeline loading code. Self-assigning.

Rocketknight1 self-assigned this Feb 28, 2025
Rocketknight1 linked a pull request Feb 28, 2025 that will close this issue
Rocketknight1 (Member) commented

Fix is open at #36480

XEric7 commented Mar 1, 2025

> Fix is open at #36480

Thanks a lot for the quick fix! I really appreciate the fast response and effort in resolving this issue. 🚀
