Replies: 2 comments
-
# ACE-Step 1.5 — LoRA Capabilities Guide

## Overview

This guide documents the current LoRA (Low-Rank Adaptation) capabilities in ACE-Step 1.5, including strength control, the current limitation around multi-LoRA usage, and an analysis of what would be required to implement multi-LoRA support.

## Current LoRA Capabilities

### What's Supported

- Loading a single LoRA adapter onto the decoder (the base decoder is backed up first)
- Controlling the loaded adapter's strength via the LoRA Scale slider (0.0 to 1.0)
- Switching to a different LoRA by restoring the base decoder and loading the new adapter
### Strength Control (LoRA Scale)

#### How It Works

The LoRA Scale slider (0.0 to 1.0, step 0.05, default 1.0) controls the influence strength of the loaded LoRA adapter.

**UI Location:** Service & Configuration section, below the LoRA path input.

Behavior at different values:

- `0.0`: the LoRA contributes nothing; output matches the base model
- `0.5`: the LoRA's weight deltas are applied at half their trained strength
- `1.0`: the LoRA is applied at full trained strength (default)
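For reference, a slider with these parameters could be declared in Gradio roughly as follows; the variable name and label are illustrative, not necessarily what the ACE-Step UI actually uses:

```python
import gradio as gr

# Hypothetical sketch of the control described above; the actual component
# name and event wiring in the ACE-Step Gradio app may differ.
lora_scale = gr.Slider(
    minimum=0.0,
    maximum=1.0,
    step=0.05,
    value=1.0,
    label="LoRA Scale",
)
```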
#### Implementation Mechanism

The scale works by multiplying PEFT's internal per-layer `scaling` factor:

```python
def set_lora_scale(self, scale: float) -> str:
    self.lora_scale = max(0.0, min(1.0, scale))  # Clamp to 0-1
    for name, module in self.model.decoder.named_modules():
        if hasattr(module, 'scaling'):
            scaling = module.scaling
            if isinstance(scaling, dict):
                # Save original scaling on first call
                if not hasattr(module, '_original_scaling'):
                    module._original_scaling = {k: v for k, v in scaling.items()}
                # Multiply original by user scale
                for adapter_name in scaling:
                    module.scaling[adapter_name] = (
                        module._original_scaling[adapter_name] * self.lora_scale
                    )
            elif isinstance(scaling, (int, float)):
                if not hasattr(module, '_original_scaling'):
                    module._original_scaling = scaling
                module.scaling = module._original_scaling * self.lora_scale
```

Each LoRA-injected layer has a base scaling factor determined by its training configuration.

#### Training-Time Factors That Affect Strength

The effective strength of a LoRA at inference time is also influenced by how it was trained, in particular:

- `r` (rank): the dimensionality of the low-rank update matrices
- `lora_alpha`: the scaling numerator stored in the adapter config

The PEFT internal scaling per layer is `lora_alpha / r`, so a LoRA trained with a high `lora_alpha` relative to its rank starts out stronger even before the slider is applied. Combined with the slider, the effective multiplier becomes `(lora_alpha / r) * lora_scale`.
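As a concrete, purely illustrative example of how the training-time scaling and the slider combine:

```python
# Hypothetical adapter config values; real values come from the LoRA's
# adapter_config.json and from the UI slider.
lora_alpha = 32
r = 16
lora_scale = 0.5                                   # user-facing "LoRA Scale" slider

base_scaling = lora_alpha / r                      # PEFT per-layer scaling -> 2.0
effective_scaling = base_scaling * lora_scale      # applied by set_lora_scale() -> 1.0
print(base_scaling, effective_scaling)
```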
## Multiple LoRAs: Current Limitation

### What Happens Today

Loading a new LoRA always unloads the previous one. The `load_lora()` implementation restores the backed-up base decoder before applying a new adapter:

```python
def load_lora(self, lora_path: str) -> str:
    # Backup base decoder if not already backed up
    if self._base_decoder is None:
        self._base_decoder = copy.deepcopy(self.model.decoder)
    else:
        # Restore base decoder before loading new LoRA
        self.model.decoder = copy.deepcopy(self._base_decoder)
```

The state is tracked with single-valued variables (not lists):

```python
self.lora_loaded = False    # Single boolean
self.use_lora = False       # Single boolean
self.lora_scale = 1.0       # Single float
self._base_decoder = None   # Single backup
```

There is no mechanism to stack, blend, or switch between multiple loaded adapters.

## What Multi-LoRA Would Require

### Approach 1: PEFT Native Multi-Adapter (Recommended)

PEFT already supports loading multiple named adapters into a single `PeftModel`. Its built-in capabilities include:

- `PeftModel.load_adapter()`: load an additional named adapter alongside existing ones
- `set_adapter()`: select which loaded adapter is active
- `add_weighted_adapter()`: combine several adapters into a new virtual adapter using per-adapter weights
- `delete_adapter()`: remove an adapter that is no longer needed
Changes would be required in the LoRA-handling code: the single-value state above would become per-adapter state, and new methods would be needed, for example a multi-adapter loader and a weighted-merge helper (both sketched below).
Example implementation sketch for `load_adapter()`:

```python
def load_adapter(self, lora_path: str, adapter_name: str = None) -> str:
    if self._base_decoder is None:
        self._base_decoder = copy.deepcopy(self.model.decoder)
    if adapter_name is None:
        adapter_name = os.path.basename(lora_path)
    if not self.loaded_adapters:
        # First adapter: wrap decoder with PeftModel
        self.model.decoder = PeftModel.from_pretrained(
            self.model.decoder, lora_path,
            adapter_name=adapter_name, is_trainable=False
        )
    else:
        # Additional adapters: use load_adapter
        self.model.decoder.load_adapter(lora_path, adapter_name=adapter_name)
    self.loaded_adapters[adapter_name] = {
        "path": lora_path, "scale": 1.0, "active": True
    }
    return f"Loaded adapter '{adapter_name}'"
```

Example implementation sketch for a weighted merge:

```python
def merge_adapters_weighted(self, weights: Dict[str, float]) -> str:
    adapter_names = list(weights.keys())
    adapter_weights = [weights[n] for n in adapter_names]
    # PEFT's add_weighted_adapter creates a new virtual adapter
    self.model.decoder.add_weighted_adapter(
        adapter_names, adapter_weights,
        combination_type="linear",
        adapter_name="merged"
    )
    self.model.decoder.set_adapter("merged")
    return f"Merged {adapter_names} with weights {adapter_weights}"
```

### Approach 2: Sequential Merge-and-Bake

A simpler but less flexible approach: merge each LoRA into the base weights permanently, then load the next.

Pros: simple to implement, no PEFT multi-adapter complexity.
Cons: merging is destructive; once baked in, an adapter cannot be re-weighted or removed without restoring the original base weights.

The existing `merge_lora_weights()` helper already performs a single merge:

```python
def merge_lora_weights(model) -> Any:
    if hasattr(model, 'decoder') and hasattr(model.decoder, 'merge_and_unload'):
        model.decoder = model.decoder.merge_and_unload()
    return model
```

This function exists but is not exposed in the UI or API.
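For illustration, here is what a sequential merge-and-bake pass could look like using PEFT directly; the `base_decoder` variable and the adapter paths are placeholders rather than actual ACE-Step names:

```python
import copy
from peft import PeftModel

# Placeholder for whatever object holds the base decoder weights.
decoder = copy.deepcopy(base_decoder)

# Bake LoRA A into the weights, then LoRA B on top of the result.
decoder = PeftModel.from_pretrained(decoder, "loras/style_a", is_trainable=False)
decoder = decoder.merge_and_unload()   # LoRA A is now part of the plain weights

decoder = PeftModel.from_pretrained(decoder, "loras/style_b", is_trainable=False)
decoder = decoder.merge_and_unload()   # LoRA B baked on top of A
```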
## UI Changes Required

For either approach, the Gradio UI would need updates. The current UI (single LoRA) exposes one LoRA path input and one LoRA Scale slider. A proposed multi-LoRA UI would replace these with a list of adapter rows, where each row would have its own path input, per-adapter scale slider, and an enable/remove control.

## Files That Would Need Changes

At minimum, the LoRA-handling code, the Gradio UI definition, and any API surface that exposes LoRA selection would likely need to change.
## VRAM Considerations

Each loaded LoRA adapter adds a small amount of VRAM: the low-rank matrices are tiny compared with the base decoder, typically on the order of tens of megabytes per adapter depending on rank and the number of targeted layers.
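As a rough back-of-the-envelope estimate (the layer count, hidden width, and rank below are illustrative values, not ACE-Step's actual architecture):

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    # LoRA adds two low-rank matrices per adapted layer: A (r x d_in) and B (d_out x r)
    return r * (d_in + d_out)

# Hypothetical numbers: 48 adapted linear layers, hidden width 2048, rank 64, fp16 weights
n_layers, width, rank = 48, 2048, 64
total_params = n_layers * lora_param_count(width, width, rank)
vram_mb = total_params * 2 / (1024 ** 2)   # 2 bytes per fp16 parameter
print(f"{total_params:,} params ≈ {vram_mb:.0f} MB")   # roughly 24 MB per adapter
```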
Multi-LoRA is memory-feasible. The main cost is the base decoder backup, which is already paid.

## Practical Workaround: Manual Sequential Merging

Until multi-LoRA is implemented, users can achieve a similar effect manually.

### Step 1: Merge First LoRA at Reduced Strength
### Step 2: Use Output as Input for Second LoRA
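A rough, untested sketch of how these two steps might be scripted with the methods described in this guide; the `handler` object, the adapter paths, and the need to refresh the internal base-decoder backup are assumptions about the workflow, not confirmed project behavior:

```python
import copy

# Step 1 (assumed): load the first LoRA, reduce its influence, and bake it in.
handler.load_lora("loras/style_a")                  # hypothetical path
handler.set_lora_scale(0.5)                         # halve LoRA A before merging
handler.model = merge_lora_weights(handler.model)   # merge_and_unload under the hood

# load_lora() restores the backed-up base decoder before loading a new adapter,
# so the merged decoder must become the new "base" first (assumption).
handler._base_decoder = copy.deepcopy(handler.model.decoder)

# Step 2 (assumed): load the second LoRA on top of the merged weights and generate.
handler.load_lora("loras/style_b")
handler.set_lora_scale(1.0)
```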
This is a serial process and will produce slightly different results than true simultaneous multi-LoRA, but it's functional today.

## Summary

- Single-LoRA loading with a 0.0 to 1.0 strength slider is supported today.
- Loading a second LoRA replaces the first; there is no stacking or blending.
- PEFT's native multi-adapter API (Approach 1) is the recommended path to multi-LoRA; sequential merge-and-bake (Approach 2) is simpler but less flexible.
## Key Source Files
-
@sigalarm Thank you for such a thorough and well-written reply. This information is gold, but I feel it's wasted if you only share it here. If you can open a ticket requesting multiple-LoRA support with this information, that would be super!
-
Is it possible to use multiple loras at once?
Can I determine the strength of each lora? For example, at 100% it's full strength, at 50% it's half strength.