refactor: optimize audio encoding logic with caching enhancements#446

Merged
ChuxiJ merged 1 commit into main from 2026-02-11-c0xd
Feb 11, 2026

Conversation


@ChuxiJ ChuxiJ commented Feb 11, 2026

  • Improved efficiency by implementing a caching mechanism for VAE audio encoding, allowing reuse of encoded latents for identical audio inputs.
  • Introduced a reference audio cache to prevent unnecessary re-encoding for shared reference audio across batch items.
  • Updated logging to provide clarity on the reuse of cached latents.

Files changed:

  • acestep/handler.py: modified audio encoding logic to incorporate caching
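The caching described above can be sketched as a small content-addressed cache: latents are keyed by a hash of the raw audio, so identical inputs (including a reference audio shared across batch items) are encoded once and reused. This is a minimal illustration, not the actual `acestep/handler.py` code; the class and function names here are hypothetical.

```python
import hashlib


def _audio_key(waveform_bytes: bytes) -> str:
    """Hash raw audio bytes so identical inputs map to the same cache entry."""
    return hashlib.sha256(waveform_bytes).hexdigest()


class LatentCache:
    """Sketch of a VAE-encoding cache: reuse latents for identical audio."""

    def __init__(self):
        self._cache = {}
        self.hits = 0

    def encode(self, waveform_bytes: bytes, vae_encode):
        key = _audio_key(waveform_bytes)
        if key in self._cache:
            self.hits += 1  # cached latents found: skip re-encoding
            return self._cache[key]
        latents = vae_encode(waveform_bytes)  # expensive VAE forward pass
        self._cache[key] = latents
        return latents
```

With this shape, a batch where every item shares one reference audio pays for a single VAE encode; subsequent items hit the cache, which is also a natural place to emit the "reusing cached latents" log line the PR mentions.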

Summary by CodeRabbit

  • New Features

    • Added GPU tier override dropdown to manually select optimization settings (offloading, quantization, backend).
    • Added GPU device information display showing VRAM and current tier.
    • Expanded backend support from restricted to all available options; vLLM now recommended across tiers.
  • Bug Fixes

    • Improved VRAM-based batch size calculations with refined memory estimation.
    • Simplified vLLM initialization logic for better reliability.
  • Chores

    • Added internationalization support for GPU tier features in English, Hebrew, Japanese, and Chinese.


coderabbitai bot commented Feb 11, 2026

Caution

Review failed

The pull request is closed.

📝 Walkthrough

The PR extends GPU tier configuration with new public APIs and UI-friendly metadata (labels, choices, device name retrieval), adds a Gradio tier override dropdown with associated event handlers, updates backend defaults across tiers from restrictive to permissive with vllm as recommended, introduces MPS-specific overrides, refines VRAM gating logic, and adds internationalization strings supporting the new UI.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **GPU Tier Configuration**<br>`acestep/gpu_config.py` | Widened `lm_backend_restriction` from `pt_mlx_only` to `all` across tiers; promoted `vllm` to `recommended_backend`. Added public APIs `get_gpu_device_name()` and `get_gpu_config_for_tier(tier)`, plus metadata constants `GPU_TIER_LABELS` and `GPU_TIER_CHOICES`. Introduced MPS overrides for macOS when the tier config is retrieved. |
| **UI Event Handling**<br>`acestep/gradio_ui/events/__init__.py`, `acestep/gradio_ui/events/generation_handlers.py` | Introduced an `on_tier_change()` handler that responds to tier dropdown changes, updates the global GPU config, recalculates available backends and models, and returns Gradio updates for backend, model, batch size, duration, and GPU info. Enhanced `update_audio_components_visibility()` with defensive handling of None/invalid `batch_size`. |
| **UI Components**<br>`acestep/gradio_ui/interfaces/generation.py` | Added a `gpu_info_display` Markdown row showing device name, VRAM, and tier label, and a `tier_dropdown` selector initialized to the current tier with `GPU_TIER_LABELS` as labels. Extended imports from the `gpu_config` module. |
| **Internationalization**<br>`acestep/gradio_ui/i18n/en.json`, `acestep/gradio_ui/i18n/he.json`, `acestep/gradio_ui/i18n/ja.json`, `acestep/gradio_ui/i18n/zh.json` | Added three new service translation keys across all locale files: `gpu_auto_tier`, `tier_label`, and `tier_info`, with localized values for each language. Minor trailing-comma adjustments in existing keys. |
| **Memory Management**<br>`acestep/handler.py`, `acestep/llm_inference.py` | Updated VRAM batch reduction to use a piecewise formula, `0.5 + max(0, 0.15 × duration_delta_min)`, instead of a linear estimate. Replaced the total-VRAM threshold (`VRAM_SAFE_TOTAL_GB`) with a free-VRAM check (`VRAM_SAFE_FREE_GB = 2.0`), simplifying the vLLM gating logic. |
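The memory-management change above can be expressed directly from the constants the summary gives: a piecewise per-item overhead estimate, and a gate on free (rather than total) VRAM. This is a sketch built only from the formula and threshold named in the PR; the function names are hypothetical.

```python
VRAM_SAFE_FREE_GB = 2.0  # free-VRAM threshold replacing the total-VRAM gate


def vram_overhead_gb(duration_delta_min: float) -> float:
    """Piecewise estimate from the PR: 0.5 GB base plus 0.15 GB per minute
    of duration above the baseline; the extra term is clamped at zero
    for durations at or below the baseline."""
    return 0.5 + max(0.0, 0.15 * duration_delta_min)


def vllm_allowed(free_vram_gb: float) -> bool:
    """Gate vLLM on currently free VRAM instead of total installed VRAM."""
    return free_vram_gb >= VRAM_SAFE_FREE_GB
```

The clamp is what makes the formula piecewise: shorter-than-baseline durations still cost the 0.5 GB floor, while longer ones grow linearly at 0.15 GB per extra minute.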

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Dropdown as Tier Dropdown
    participant Handler as on_tier_change Handler
    participant Config as GPU Config
    participant Backend as Backend Selector
    participant UI as Gradio Components

    User->>Dropdown: Selects tier
    Dropdown->>Handler: Trigger tier change event
    Handler->>Config: get_gpu_config_for_tier(tier)
    Config->>Config: Apply MPS overrides if on macOS
    Config->>Handler: Return GPUConfig
    Handler->>Handler: set_global_gpu_config()
    Handler->>Backend: Compute available backends<br/>& LM models
    Backend->>Handler: Return choices & default model
    Handler->>UI: Update backend dropdown<br/>Update model selector<br/>Update batch size<br/>Update duration<br/>Update GPU info
    UI->>User: Display reconfigured UI
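The flow in the diagram above can be sketched as plain functions: a tier lookup that applies an MPS override on macOS, and a change handler that returns the values the UI components need. Everything here (the tier keys, labels, and config fields) is illustrative; the real definitions live in `acestep/gpu_config.py` and the Gradio event handlers.

```python
import platform

# Hypothetical tier metadata standing in for GPU_TIER_CHOICES / GPU_TIER_LABELS.
GPU_TIER_CHOICES = ["low", "mid", "high"]
GPU_TIER_LABELS = {
    "low": "Low (<8 GB)",
    "mid": "Mid (8-16 GB)",
    "high": "High (>16 GB)",
}

_BASE_CONFIGS = {
    "low":  {"offload": True,  "quantize": True,  "recommended_backend": "vllm"},
    "mid":  {"offload": True,  "quantize": False, "recommended_backend": "vllm"},
    "high": {"offload": False, "quantize": False, "recommended_backend": "vllm"},
}


def get_gpu_config_for_tier(tier: str) -> dict:
    """Return the tier's config, applying MPS overrides on macOS."""
    cfg = dict(_BASE_CONFIGS[tier])
    if platform.system() == "Darwin":
        # Assumed MPS override: fall back to the PyTorch backend, no quantization.
        cfg["recommended_backend"] = "pt"
        cfg["quantize"] = False
    return cfg


def on_tier_change(tier: str) -> dict:
    """Recompute the config for the selected tier and return the
    values the backend dropdown and GPU info display would be updated with."""
    cfg = get_gpu_config_for_tier(tier)
    return {
        "backend": cfg["recommended_backend"],
        "tier_label": GPU_TIER_LABELS[tier],
    }
```

In the actual UI the handler would return `gr.update(...)` objects for each component rather than a plain dict, but the data flow is the same: dropdown value in, refreshed backend/model/batch-size/GPU-info values out.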

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A dropdown hops in, so shiny and new,
GPU tiers now pick what's best just for you!
vLLM bounces brightly across every tier,
MPS hops along without any fear,
Memory guards watch with fresh logic clear! 🎲


@ChuxiJ ChuxiJ merged commit 2484594 into main Feb 11, 2026
1 check was pending