refactor: optimize audio encoding logic with caching enhancements#446

Merged
ChuxiJ merged 1 commit into main from 2026-02-11-c0xd
Feb 11, 2026

Conversation


@ChuxiJ ChuxiJ commented Feb 11, 2026

  • Improved efficiency by implementing a caching mechanism for VAE audio encoding, allowing reuse of encoded latents for identical audio inputs.
  • Introduced a reference audio cache to prevent unnecessary re-encoding for shared reference audio across batch items.
  • Updated logging to provide clarity on the reuse of cached latents.

Files changed:

  • acestep/handler.py: modified audio encoding logic to incorporate caching
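The caching described above can be sketched as a small content-addressed cache: latents are keyed by a hash of the raw audio, so identical inputs (including a reference audio shared across batch items) are encoded once and reused. This is a minimal illustration, not the actual `acestep/handler.py` code; the class and function names here are hypothetical.

```python
import hashlib


def _audio_key(waveform_bytes: bytes) -> str:
    """Hash raw audio bytes so identical inputs map to the same cache entry."""
    return hashlib.sha256(waveform_bytes).hexdigest()


class LatentCache:
    """Sketch of a VAE-encoding cache: reuse latents for identical audio."""

    def __init__(self):
        self._cache = {}
        self.hits = 0

    def encode(self, waveform_bytes: bytes, vae_encode):
        key = _audio_key(waveform_bytes)
        if key in self._cache:
            self.hits += 1  # cached latents found: skip re-encoding
            return self._cache[key]
        latents = vae_encode(waveform_bytes)  # expensive VAE forward pass
        self._cache[key] = latents
        return latents
```

With this shape, a batch where every item shares one reference audio pays for a single VAE encode; subsequent items hit the cache, which is also a natural place to emit the "reusing cached latents" log line the PR mentions.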

Summary by CodeRabbit

  • New Features

    • Added GPU tier override dropdown to manually select optimization settings (offloading, quantization, backend).
    • Added GPU device information display showing VRAM and current tier.
    • Expanded backend support from restricted to all available options; vLLM now recommended across tiers.
  • Bug Fixes

    • Improved VRAM-based batch size calculations with refined memory estimation.
    • Simplified vLLM initialization logic for better reliability.
  • Chores

    • Added internationalization support for GPU tier features in English, Hebrew, Japanese, and Chinese.


coderabbitai bot commented Feb 11, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

The PR extends GPU tier configuration with new public APIs and UI-friendly metadata (labels, choices, device name retrieval), adds a Gradio tier override dropdown with associated event handlers, updates backend defaults across tiers from restrictive to permissive with vllm as recommended, introduces MPS-specific overrides, refines VRAM gating logic, and adds internationalization strings supporting the new UI.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **GPU Tier Configuration**<br>`acestep/gpu_config.py` | Widened `lm_backend_restriction` from `pt_mlx_only` to `all` across tiers; promoted `vllm` to `recommended_backend`. Added public APIs `get_gpu_device_name()` and `get_gpu_config_for_tier(tier)`, plus metadata constants `GPU_TIER_LABELS` and `GPU_TIER_CHOICES`. Introduced MPS overrides for macOS when the tier config is retrieved. |
| **UI Event Handling**<br>`acestep/gradio_ui/events/__init__.py`, `acestep/gradio_ui/events/generation_handlers.py` | Introduced an `on_tier_change()` handler that responds to tier dropdown changes, updates the global GPU config, recalculates available backends and models, and returns Gradio updates for backend, model, batch size, duration, and GPU info. Enhanced `update_audio_components_visibility()` with defensive handling of None/invalid `batch_size`. |
| **UI Components**<br>`acestep/gradio_ui/interfaces/generation.py` | Added a `gpu_info_display` Markdown row showing device name, VRAM, and tier label, and a `tier_dropdown` selector initialized to the current tier with `GPU_TIER_LABELS` as labels. Extended imports from the `gpu_config` module. |
| **Internationalization**<br>`acestep/gradio_ui/i18n/en.json`, `acestep/gradio_ui/i18n/he.json`, `acestep/gradio_ui/i18n/ja.json`, `acestep/gradio_ui/i18n/zh.json` | Added three new service translation keys across all locale files: `gpu_auto_tier`, `tier_label`, and `tier_info`, with localized values for each language. Minor trailing-comma adjustments in existing keys. |
| **Memory Management**<br>`acestep/handler.py`, `acestep/llm_inference.py` | Updated VRAM batch reduction to use a piecewise formula, `0.5 + max(0, 0.15 × duration_delta_min)`, instead of a linear estimate. Replaced the total-VRAM threshold (`VRAM_SAFE_TOTAL_GB`) with a free-VRAM check (`VRAM_SAFE_FREE_GB = 2.0`), simplifying the vLLM gating logic. |
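The memory-management change above can be expressed directly from the constants the summary gives: a piecewise per-item overhead estimate, and a gate on free (rather than total) VRAM. This is a sketch built only from the formula and threshold named in the PR; the function names are hypothetical.

```python
VRAM_SAFE_FREE_GB = 2.0  # free-VRAM threshold replacing the total-VRAM gate


def vram_overhead_gb(duration_delta_min: float) -> float:
    """Piecewise estimate from the PR: 0.5 GB base plus 0.15 GB per minute
    of duration above the baseline; the extra term is clamped at zero
    for durations at or below the baseline."""
    return 0.5 + max(0.0, 0.15 * duration_delta_min)


def vllm_allowed(free_vram_gb: float) -> bool:
    """Gate vLLM on currently free VRAM instead of total installed VRAM."""
    return free_vram_gb >= VRAM_SAFE_FREE_GB
```

The clamp is what makes the formula piecewise: shorter-than-baseline durations still cost the 0.5 GB floor, while longer ones grow linearly at 0.15 GB per extra minute.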

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Dropdown as Tier Dropdown
    participant Handler as on_tier_change Handler
    participant Config as GPU Config
    participant Backend as Backend Selector
    participant UI as Gradio Components

    User->>Dropdown: Selects tier
    Dropdown->>Handler: Trigger tier change event
    Handler->>Config: get_gpu_config_for_tier(tier)
    Config->>Config: Apply MPS overrides if on macOS
    Config->>Handler: Return GPUConfig
    Handler->>Handler: set_global_gpu_config()
    Handler->>Backend: Compute available backends<br/>& LM models
    Backend->>Handler: Return choices & default model
    Handler->>UI: Update backend dropdown<br/>Update model selector<br/>Update batch size<br/>Update duration<br/>Update GPU info
    UI->>User: Display reconfigured UI
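The flow in the diagram above can be sketched as plain functions: a tier lookup that applies an MPS override on macOS, and a change handler that returns the values the UI components need. Everything here (the tier keys, labels, and config fields) is illustrative; the real definitions live in `acestep/gpu_config.py` and the Gradio event handlers.

```python
import platform

# Hypothetical tier metadata standing in for GPU_TIER_CHOICES / GPU_TIER_LABELS.
GPU_TIER_CHOICES = ["low", "mid", "high"]
GPU_TIER_LABELS = {
    "low": "Low (<8 GB)",
    "mid": "Mid (8-16 GB)",
    "high": "High (>16 GB)",
}

_BASE_CONFIGS = {
    "low":  {"offload": True,  "quantize": True,  "recommended_backend": "vllm"},
    "mid":  {"offload": True,  "quantize": False, "recommended_backend": "vllm"},
    "high": {"offload": False, "quantize": False, "recommended_backend": "vllm"},
}


def get_gpu_config_for_tier(tier: str) -> dict:
    """Return the tier's config, applying MPS overrides on macOS."""
    cfg = dict(_BASE_CONFIGS[tier])
    if platform.system() == "Darwin":
        # Assumed MPS override: fall back to the PyTorch backend, no quantization.
        cfg["recommended_backend"] = "pt"
        cfg["quantize"] = False
    return cfg


def on_tier_change(tier: str) -> dict:
    """Recompute the config for the selected tier and return the
    values the backend dropdown and GPU info display would be updated with."""
    cfg = get_gpu_config_for_tier(tier)
    return {
        "backend": cfg["recommended_backend"],
        "tier_label": GPU_TIER_LABELS[tier],
    }
```

In the actual UI the handler would return `gr.update(...)` objects for each component rather than a plain dict, but the data flow is the same: dropdown value in, refreshed backend/model/batch-size/GPU-info values out.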

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A dropdown hops in, so shiny and new,
GPU tiers now pick what's best just for you!
vLLM bounces brightly across every tier,
MPS hops along without any fear,
Memory guards watch with fresh logic clear! 🎲


@ChuxiJ ChuxiJ merged commit 2484594 into main Feb 11, 2026
1 check was pending