ChuxiJ commented Feb 11, 2026

  • Further optimized the caching mechanism for VAE audio encoding, ensuring efficient reuse of encoded latents.
  • Refined logging to provide better insight into the caching process and latent reuse.

Files changed:

  • acestep/handler.py: updated audio encoding logic to enhance caching efficiency
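The handler.py diff itself is not shown in this conversation, so the following is only a minimal sketch of the general idea of reusing encoded latents, not the project's implementation. `LatentCache`, `encode_fn`, and the digest-based key are all hypothetical names and choices made for illustration; the real encoder would be the VAE, which is replaced here by an injected callable.

```python
import hashlib
import logging
from array import array
from typing import Callable, Dict, List, Sequence

logger = logging.getLogger(__name__)


class LatentCache:
    """Memoizes encoder output, keyed by a digest of the input samples.

    Hypothetical helper: the class name, the key scheme, and the counters
    are assumptions for illustration only.
    """

    def __init__(self, encode_fn: Callable[[Sequence[float]], List[float]]):
        self._encode_fn = encode_fn            # stand-in for the VAE encoder
        self._store: Dict[str, List[float]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(samples: Sequence[float]) -> str:
        # Hash the raw sample bytes so identical audio maps to the same key.
        return hashlib.sha256(array("d", samples).tobytes()).hexdigest()

    def encode(self, samples: Sequence[float]) -> List[float]:
        key = self._key(samples)
        if key in self._store:
            self.hits += 1
            logger.debug("latent cache hit (%s), reusing encoded latents", key[:8])
            return self._store[key]
        self.misses += 1
        logger.debug("latent cache miss (%s), encoding", key[:8])
        latents = self._encode_fn(samples)
        self._store[key] = latents
        return latents
```

Encoding the same samples twice calls `encode_fn` once and returns the stored object the second time; the hit and miss counters give the kind of visibility the logging change describes.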

Summary by CodeRabbit

  • New Features

    • Added normalization controls to the generation pipeline
    • Improved audio format awareness and handling in generation metadata
  • Changes

    • Default audio output format changed from MP3 to FLAC
    • Audio format dropdown reordered with FLAC as the primary option
  • Improvements

    • Enhanced audio duration detection with improved fallback handling

coderabbitai bot commented Feb 11, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

The PR updates the default audio format from MP3 to FLAC across the generation pipeline (API routes, handlers, UI), propagates audio_format parameters through handler functions to build generation information with format awareness, replaces audio duration extraction to prioritize torchcodec, and adds enable_normalization and normalization_db parameters to the generation wrapper.
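As a rough illustration of the default change and of passing audio_format through to the generation info, here is a small sketch. It is not the repository's code: `GenerationConfigSketch`, `build_generation_info_sketch`, and the wording of the info string are hypothetical, and only the "flac" default and the ["flac", "mp3"] ordering come from the walkthrough.

```python
from dataclasses import dataclass

# FLAC first, mirroring the reordered dropdown described above.
AUDIO_FORMAT_CHOICES = ["flac", "mp3"]


@dataclass
class GenerationConfigSketch:
    """Hypothetical stand-in for a generation config; only the default is sourced."""

    audio_format: str = "flac"  # previously "mp3"


def build_generation_info_sketch(
    num_songs: int, total_seconds: float, audio_format: str
) -> str:
    """Builds a format-aware summary line with a per-song timing metric.

    The message layout is an assumption made for illustration.
    """
    if audio_format not in AUDIO_FORMAT_CHOICES:
        raise ValueError(f"unsupported audio format: {audio_format!r}")
    per_song = total_seconds / num_songs if num_songs else 0.0
    return (
        f"Generated {num_songs} song(s) as {audio_format.upper()} "
        f"in {total_seconds:.2f}s ({per_song:.2f}s per song)"
    )
```

A caller that never sets audio_format would pick up "flac" from the config and the info string would report it, which is the behavior the walkthrough describes at a high level.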

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Audio Format Configuration Defaults: acestep/gradio_ui/api_routes.py, acestep/gradio_ui/events/generation_handlers.py, acestep/gradio_ui/interfaces/generation.py | Changed default audio_format from "mp3" to "flac" in GenerationConfig, load_metadata, and the UI dropdown. Reordered dropdown choices to ["flac", "mp3"]. |
| Generation Results and Metadata: acestep/gradio_ui/events/results_handlers.py | Added clear_audio_outputs_for_new_generation helper function, extended _build_generation_info with audio_format parameter, reworked generation timing output to include per-song metrics, propagated audio_format through generate_with_progress, updated send_audio_to_src_with_metadata return tuple to include audio_uploads_accordion, and switched default audio_format to "flac" for batch generation. |
| Audio Duration Extraction: acestep/training/dataset_builder_modules/audio_io.py | Replaced primary audio duration extraction from torchaudio.info to torchcodec AudioDecoder, with fallback to soundfile on failure. Adjusted logging from warning to debug for primary path failure. |
| Generation Event Wrapper: acestep/gradio_ui/events/__init__.py | Added enable_normalization and normalization_db as new inputs/outputs in the call to generate_with_batch_management. |
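The primary-then-fallback pattern described for audio duration extraction can be sketched as below. This is an assumption-laden illustration rather than the code in audio_io.py: the probes are injected callables so the sketch makes no claims about the torchcodec or soundfile APIs, `get_duration_seconds` is a hypothetical name, and logging later failures at warning level is a guess (the summary only states that the primary failure is now logged at debug).

```python
import logging
from typing import Callable, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)

# A probe takes a file path and returns a duration in seconds, or raises.
DurationProbe = Callable[[str], float]


def get_duration_seconds(
    path: str, probes: Sequence[Tuple[str, DurationProbe]]
) -> Optional[float]:
    """Tries each probe in order and returns the first duration obtained.

    The first probe is treated as the primary path: its failure is logged
    at DEBUG because a fallback is expected to handle it. Failures of later
    probes are logged at WARNING (an assumption made for this sketch).
    """
    for index, (name, probe) in enumerate(probes):
        try:
            return float(probe(path))
        except Exception as exc:  # any decoder error triggers the fallback
            level = logging.DEBUG if index == 0 else logging.WARNING
            logger.log(level, "%s could not read duration of %s: %s", name, path, exc)
    return None  # every probe failed
```

In the PR's terms, the first entry would wrap the torchcodec decoder and the second would wrap soundfile; keeping them as plain callables also makes the fallback order easy to test without either library installed.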

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Suggested reviewers

  • ChuxiJ

Poem

🐰 Flac and forth through generation flows,
Audio formats all aglow,
From MP3's reign to FLAC's new call,
Torchcodec decodes them all!
Normalization hops along the way, 🎵

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title mentions 'improve audio encoding logic with enhanced caching' but the actual changes focus on switching audio codec from torchaudio to torchcodec, changing default audio format from mp3 to flac, and reordering UI options. The caching optimization mentioned in the title is not reflected in the provided file summaries. | Update the title to accurately reflect the main changes, such as 'refactor: switch audio codec to torchcodec and update default audio format to flac' to better represent the actual modifications. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

…s only within the system temporary directory. Update create_app to explicitly validate audio paths from JSON input. Refactor audio_utils to improve code clarity and maintainability.
ChuxiJ merged commit 53a60ba into main Feb 11, 2026
1 check was pending