refactor: improve audio encoding logic with enhanced caching #448

ChuxiJ · 2026-02-11T12:31:01Z

Further optimized the caching mechanism for VAE audio encoding so that encoded latents are reused efficiently.
Refined logging to give better insight into the caching process and latent reuse.

Files changed:

acestep/handler.py: updated audio encoding logic to enhance caching efficiency
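
For readers without the diff at hand, the idea of reusing encoded latents can be sketched roughly as follows. This is a minimal illustration, not the code in acestep/handler.py: the `LatentCache` class, its `fingerprint` scheme, and the `encode_fn` callable are all hypothetical names, and plain Python lists stand in for audio tensors.

```python
import hashlib
import logging
from collections import OrderedDict
from typing import Any, Callable, Sequence

logger = logging.getLogger("latent_cache_sketch")


class LatentCache:
    """Small LRU cache mapping an audio fingerprint to its encoded latents.

    Hypothetical helper for illustration only; not part of acestep.
    """

    def __init__(self, encode_fn: Callable[[Sequence[float]], Any], max_items: int = 4):
        self._encode_fn = encode_fn          # stand-in for the VAE encode call
        self._max_items = max_items
        self._store: "OrderedDict[str, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def fingerprint(samples: Sequence[float], sample_rate: int) -> str:
        # Hash the sample rate and the sample values so identical audio maps to one key.
        digest = hashlib.sha256()
        digest.update(str(sample_rate).encode("utf-8"))
        digest.update(b"|")
        digest.update(repr(list(samples)).encode("utf-8"))
        return digest.hexdigest()

    def encode(self, samples: Sequence[float], sample_rate: int) -> Any:
        key = self.fingerprint(samples, sample_rate)
        if key in self._store:
            # Cache hit: mark the entry as recently used and skip re-encoding.
            self._store.move_to_end(key)
            self.hits += 1
            logger.debug("latent cache hit key=%s hits=%d", key[:8], self.hits)
            return self._store[key]

        # Cache miss: encode once, store the result, evict the oldest entry if full.
        self.misses += 1
        logger.debug("latent cache miss key=%s misses=%d", key[:8], self.misses)
        latents = self._encode_fn(samples)
        self._store[key] = latents
        if len(self._store) > self._max_items:
            self._store.popitem(last=False)
        return latents
```

Encoding the same audio twice through one `LatentCache` instance calls `encode_fn` only once; the hit and miss counters give the kind of visibility the refined logging is meant to provide.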

Summary by CodeRabbit

New Features
- Added normalization controls to the generation pipeline
- Improved audio format awareness and handling in generation metadata

Changes
- Default audio output format changed from MP3 to FLAC
- Audio format dropdown reordered with FLAC as the primary option

Improvements
- Enhanced audio duration detection with improved fallback handling


coderabbitai · 2026-02-11T12:31:21Z

Caution

Review failed

The pull request is closed.

📝 Walkthrough

The PR changes the default audio format from MP3 to FLAC across the generation pipeline (API routes, handlers, UI) and propagates the audio_format parameter through the handler functions so that generation information is built with format awareness. It also switches audio duration extraction to try torchcodec first, falling back to soundfile on failure, and adds the enable_normalization and normalization_db parameters to the generation wrapper.
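
The primary-then-fallback pattern for duration extraction can be sketched as below. This is an illustrative outline under stated assumptions, not the code in audio_io.py: `get_duration_seconds` and `wav_duration_seconds` are hypothetical names, the primary probe is passed in as a callable (in the PR it would be the torchcodec-based path), and the standard-library `wave` module stands in for soundfile so the sketch runs without extra dependencies.

```python
import logging
import wave
from typing import Callable, Optional

logger = logging.getLogger("duration_sketch")


def wav_duration_seconds(path: str) -> float:
    """Fallback probe: read the frame count and rate from a WAV header (stdlib only)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def get_duration_seconds(
    path: str,
    primary: Callable[[str], float],
    fallback: Callable[[str], float] = wav_duration_seconds,
) -> Optional[float]:
    """Try the primary probe first; on any failure, fall back to the secondary one."""
    try:
        return float(primary(path))
    except Exception as exc:
        # A primary-path failure is expected for some files, so it is logged at debug level.
        logger.debug("primary duration probe failed for %s: %s", path, exc)

    try:
        return float(fallback(path))
    except Exception as exc:
        # Both probes failed: this is worth a warning, and the caller gets None.
        logger.warning("could not determine duration for %s: %s", path, exc)
        return None
```

Logging the first failure at debug level mirrors the change described in the walkthrough: a failed primary probe is no longer noisy as long as the fallback can still answer.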

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Audio Format Configuration Defaults: `acestep/gradio_ui/api_routes.py`, `acestep/gradio_ui/events/generation_handlers.py`, `acestep/gradio_ui/interfaces/generation.py` | Changed default audio_format from "mp3" to "flac" in GenerationConfig, load_metadata, and the UI dropdown. Reordered dropdown choices to ["flac", "mp3"]. |
| Generation Results and Metadata: `acestep/gradio_ui/events/results_handlers.py` | Added clear_audio_outputs_for_new_generation helper function, extended _build_generation_info with audio_format parameter, reworked generation timing output to include per-song metrics, propagated audio_format through generate_with_progress, updated send_audio_to_src_with_metadata return tuple to include audio_uploads_accordion, and switched default audio_format to "flac" for batch generation. |
| Audio Duration Extraction: `acestep/training/dataset_builder_modules/audio_io.py` | Replaced primary audio duration extraction from torchaudio.info to torchcodec AudioDecoder, with fallback to soundfile on failure. Adjusted logging from warning to debug for primary path failure. |
| Generation Event Wrapper: `acestep/gradio_ui/events/__init__.py` | Added enable_normalization and normalization_db as new inputs/outputs in the call to generate_with_batch_management. |
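
The summary does not say how enable_normalization and normalization_db are interpreted. One plausible reading, given the related clipping work in #406, is peak normalization to a target level in dBFS. The sketch below illustrates that reading only: `normalize_peak` is a hypothetical function, the -1.0 dB default is an assumption, and plain lists stand in for audio tensors.

```python
from typing import List, Sequence


def normalize_peak(
    samples: Sequence[float],
    target_db: float = -1.0,   # assumed meaning of normalization_db: target peak in dBFS
    enabled: bool = True,      # assumed meaning of enable_normalization
) -> List[float]:
    """Scale samples so the largest absolute value sits at target_db dBFS."""
    if not enabled:
        return list(samples)

    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silence (or empty input) cannot be scaled; return it unchanged.
        return list(samples)

    # Convert decibels to a linear amplitude: amplitude = 10 ** (dB / 20).
    target_peak = 10.0 ** (target_db / 20.0)
    gain = target_peak / peak
    return [s * gain for s in samples]
```

With `target_db=0.0` the loudest sample is scaled to exactly full scale; negative values leave headroom, which is the usual way to guard against clipping after format conversion.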

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

feat: add fallback to get duration #411: Modifies audio duration extraction logic in training/dataset_builder_modules/audio_io.py to change fallback behavior from torchaudio to alternative decoders.
Normalize audio against clipping and 32bit wav support #406: Updates audio_format handling and defaults in generation UI and handlers, with similar propagation of audio_format parameters through the generation pipeline.

Suggested reviewers

ChuxiJ

Poem

🐰 Flac and forth through generation flows,
Audio formats all aglow,
From MP3's reign to FLAC's new call,
Torchcodec decodes them all!
Normalization hops along the way, 🎵

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title mentions 'improve audio encoding logic with enhanced caching' but the actual changes focus on switching audio codec from torchaudio to torchcodec, changing default audio format from mp3 to flac, and reordering UI options. The caching optimization mentioned in the title is not reflected in the provided file summaries. | Update the title to accurately reflect the main changes, such as 'refactor: switch audio codec to torchcodec and update default audio format to flac' to better represent the actual modifications. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch flac-codec

Comment @coderabbitai help to get the list of available commands and usage tips.

Enhance audio path validation in RequestParser to allow absolute paths only within the system temporary directory. Update create_app to explicitly validate audio paths from JSON input. Refactor audio_utils to improve code clarity and maintainability.

dba86f5

ChuxiJ merged commit 53a60ba into main Feb 11, 2026
1 check was pending
