Replies: 2 comments
-
# ACE-Step 1.5 — LoRA Capabilities Guide

## Overview

This guide documents the current LoRA (Low-Rank Adaptation) capabilities in ACE-Step 1.5, including strength control, the current limitation around multi-LoRA usage, and an analysis of what would be required to implement multi-LoRA support.

## Current LoRA Capabilities

### What's Supported

- Loading a single LoRA adapter onto the decoder (the base decoder is backed up first)
- Controlling the loaded adapter's strength via the LoRA Scale slider (0.0 to 1.0)
- Switching to a different LoRA by restoring the base decoder and loading the new adapter
### Strength Control (LoRA Scale)

#### How It Works

The LoRA Scale slider (0.0 to 1.0, step 0.05, default 1.0) controls the influence strength of the loaded LoRA adapter.

**UI Location:** Service & Configuration section, below the LoRA path input.

Behavior at different values:

- `0.0`: the LoRA contributes nothing; output matches the base model
- `0.5`: the LoRA's weight deltas are applied at half their trained strength
- `1.0`: the LoRA is applied at full trained strength (default)
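For reference, a slider with these parameters could be declared in Gradio roughly as follows; the variable name and label are illustrative, not necessarily what the ACE-Step UI actually uses:

```python
import gradio as gr

# Hypothetical sketch of the control described above; the actual component
# name and event wiring in the ACE-Step Gradio app may differ.
lora_scale = gr.Slider(
    minimum=0.0,
    maximum=1.0,
    step=0.05,
    value=1.0,
    label="LoRA Scale",
)
```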
#### Implementation Mechanism

The scale works by multiplying PEFT's internal per-layer `scaling` factor:

```python
def set_lora_scale(self, scale: float) -> str:
    self.lora_scale = max(0.0, min(1.0, scale))  # Clamp to 0-1
    for name, module in self.model.decoder.named_modules():
        if hasattr(module, 'scaling'):
            scaling = module.scaling
            if isinstance(scaling, dict):
                # Save original scaling on first call
                if not hasattr(module, '_original_scaling'):
                    module._original_scaling = {k: v for k, v in scaling.items()}
                # Multiply original by user scale
                for adapter_name in scaling:
                    module.scaling[adapter_name] = (
                        module._original_scaling[adapter_name] * self.lora_scale
                    )
            elif isinstance(scaling, (int, float)):
                if not hasattr(module, '_original_scaling'):
                    module._original_scaling = scaling
                module.scaling = module._original_scaling * self.lora_scale
```

Each LoRA-injected layer has a base scaling factor determined by its training configuration.

#### Training-Time Factors That Affect Strength

The effective strength of a LoRA at inference time is also influenced by how it was trained, in particular:

- `r` (rank): the dimensionality of the low-rank update matrices
- `lora_alpha`: the scaling numerator stored in the adapter config

The PEFT internal scaling per layer is `lora_alpha / r`, so a LoRA trained with a high `lora_alpha` relative to its rank starts out stronger even before the slider is applied. Combined with the slider, the effective multiplier becomes `(lora_alpha / r) * lora_scale`.
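As a concrete, purely illustrative example of how the training-time scaling and the slider combine:

```python
# Hypothetical adapter config values; real values come from the LoRA's
# adapter_config.json and from the UI slider.
lora_alpha = 32
r = 16
lora_scale = 0.5                                   # user-facing "LoRA Scale" slider

base_scaling = lora_alpha / r                      # PEFT per-layer scaling -> 2.0
effective_scaling = base_scaling * lora_scale      # applied by set_lora_scale() -> 1.0
print(base_scaling, effective_scaling)
```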
## Multiple LoRAs: Current Limitation

### What Happens Today

Loading a new LoRA always unloads the previous one. The `load_lora()` implementation restores the backed-up base decoder before applying a new adapter:

```python
def load_lora(self, lora_path: str) -> str:
    # Backup base decoder if not already backed up
    if self._base_decoder is None:
        self._base_decoder = copy.deepcopy(self.model.decoder)
    else:
        # Restore base decoder before loading new LoRA
        self.model.decoder = copy.deepcopy(self._base_decoder)
```

The state is tracked with single-valued variables (not lists):

```python
self.lora_loaded = False    # Single boolean
self.use_lora = False       # Single boolean
self.lora_scale = 1.0       # Single float
self._base_decoder = None   # Single backup
```

There is no mechanism to stack, blend, or switch between multiple loaded adapters.

## What Multi-LoRA Would Require

### Approach 1: PEFT Native Multi-Adapter (Recommended)

PEFT already supports loading multiple named adapters into a single `PeftModel`. Its built-in capabilities include:

- `PeftModel.load_adapter()`: load an additional named adapter alongside existing ones
- `set_adapter()`: select which loaded adapter is active
- `add_weighted_adapter()`: combine several adapters into a new virtual adapter using per-adapter weights
- `delete_adapter()`: remove an adapter that is no longer needed
Changes would be required in the LoRA-handling code: the single-value state above would become per-adapter state, and new methods would be needed, for example a multi-adapter loader and a weighted-merge helper (both sketched below).
Example implementation sketch for `load_adapter()`:

```python
def load_adapter(self, lora_path: str, adapter_name: str = None) -> str:
    if self._base_decoder is None:
        self._base_decoder = copy.deepcopy(self.model.decoder)
    if adapter_name is None:
        adapter_name = os.path.basename(lora_path)
    if not self.loaded_adapters:
        # First adapter: wrap decoder with PeftModel
        self.model.decoder = PeftModel.from_pretrained(
            self.model.decoder, lora_path,
            adapter_name=adapter_name, is_trainable=False
        )
    else:
        # Additional adapters: use load_adapter
        self.model.decoder.load_adapter(lora_path, adapter_name=adapter_name)
    self.loaded_adapters[adapter_name] = {
        "path": lora_path, "scale": 1.0, "active": True
    }
    return f"Loaded adapter '{adapter_name}'"
```

Example implementation sketch for a weighted merge:

```python
def merge_adapters_weighted(self, weights: Dict[str, float]) -> str:
    adapter_names = list(weights.keys())
    adapter_weights = [weights[n] for n in adapter_names]
    # PEFT's add_weighted_adapter creates a new virtual adapter
    self.model.decoder.add_weighted_adapter(
        adapter_names, adapter_weights,
        combination_type="linear",
        adapter_name="merged"
    )
    self.model.decoder.set_adapter("merged")
    return f"Merged {adapter_names} with weights {adapter_weights}"
```

### Approach 2: Sequential Merge-and-Bake

A simpler but less flexible approach: merge each LoRA into the base weights permanently, then load the next.

Pros: simple to implement, no PEFT multi-adapter complexity.
Cons: merging is destructive; once baked in, an adapter cannot be re-weighted or removed without restoring the original base weights.

The existing `merge_lora_weights()` helper already performs a single merge:

```python
def merge_lora_weights(model) -> Any:
    if hasattr(model, 'decoder') and hasattr(model.decoder, 'merge_and_unload'):
        model.decoder = model.decoder.merge_and_unload()
    return model
```

This function exists but is not exposed in the UI or API.
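For illustration, here is what a sequential merge-and-bake pass could look like using PEFT directly; the `base_decoder` variable and the adapter paths are placeholders rather than actual ACE-Step names:

```python
import copy
from peft import PeftModel

# Placeholder for whatever object holds the base decoder weights.
decoder = copy.deepcopy(base_decoder)

# Bake LoRA A into the weights, then LoRA B on top of the result.
decoder = PeftModel.from_pretrained(decoder, "loras/style_a", is_trainable=False)
decoder = decoder.merge_and_unload()   # LoRA A is now part of the plain weights

decoder = PeftModel.from_pretrained(decoder, "loras/style_b", is_trainable=False)
decoder = decoder.merge_and_unload()   # LoRA B baked on top of A
```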
## UI Changes Required

For either approach, the Gradio UI would need updates. The current UI (single LoRA) exposes one LoRA path input and one LoRA Scale slider. A proposed multi-LoRA UI would replace these with a list of adapter rows, where each row would have its own path input, per-adapter scale slider, and an enable/remove control.

## Files That Would Need Changes

At minimum, the LoRA-handling code, the Gradio UI definition, and any API surface that exposes LoRA selection would likely need to change.
## VRAM Considerations

Each loaded LoRA adapter adds a small amount of VRAM: the low-rank matrices are tiny compared with the base decoder, typically on the order of tens of megabytes per adapter depending on rank and the number of targeted layers.
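As a rough back-of-the-envelope estimate (the layer count, hidden width, and rank below are illustrative values, not ACE-Step's actual architecture):

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    # LoRA adds two low-rank matrices per adapted layer: A (r x d_in) and B (d_out x r)
    return r * (d_in + d_out)

# Hypothetical numbers: 48 adapted linear layers, hidden width 2048, rank 64, fp16 weights
n_layers, width, rank = 48, 2048, 64
total_params = n_layers * lora_param_count(width, width, rank)
vram_mb = total_params * 2 / (1024 ** 2)   # 2 bytes per fp16 parameter
print(f"{total_params:,} params ≈ {vram_mb:.0f} MB")   # roughly 24 MB per adapter
```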
Multi-LoRA is memory-feasible. The main cost is the base decoder backup, which is already paid.

## Practical Workaround: Manual Sequential Merging

Until multi-LoRA is implemented, users can achieve a similar effect manually.

### Step 1: Merge First LoRA at Reduced Strength
### Step 2: Use Output as Input for Second LoRA
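A rough, untested sketch of how these two steps might be scripted with the methods described in this guide; the `handler` object, the adapter paths, and the need to refresh the internal base-decoder backup are assumptions about the workflow, not confirmed project behavior:

```python
import copy

# Step 1 (assumed): load the first LoRA, reduce its influence, and bake it in.
handler.load_lora("loras/style_a")                  # hypothetical path
handler.set_lora_scale(0.5)                         # halve LoRA A before merging
handler.model = merge_lora_weights(handler.model)   # merge_and_unload under the hood

# load_lora() restores the backed-up base decoder before loading a new adapter,
# so the merged decoder must become the new "base" first (assumption).
handler._base_decoder = copy.deepcopy(handler.model.decoder)

# Step 2 (assumed): load the second LoRA on top of the merged weights and generate.
handler.load_lora("loras/style_b")
handler.set_lora_scale(1.0)
```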
This is a serial process and will produce slightly different results than true simultaneous multi-LoRA, but it's functional today.

## Summary

- Single-LoRA loading with a 0.0 to 1.0 strength slider is supported today.
- Loading a second LoRA replaces the first; there is no stacking or blending.
- PEFT's native multi-adapter API (Approach 1) is the recommended path to multi-LoRA; sequential merge-and-bake (Approach 2) is simpler but less flexible.
## Key Source Files
-
@sigalarm Thank you for such a thorough and well-written reply. This information is gold, but I feel it's wasted if you only share it here. If you can open a ticket requesting multiple-LoRA support with this information, that would be super!
-
Is it possible to use multiple loras at once?
Can I determine the strength of each lora? For example, at 100% it's full strength, at 50% it's half strength.