
Add sound generation support #222

Open
saarnilauri wants to merge 3 commits into WordPress:trunk from saarnilauri:feature/sound-model


@saarnilauri
Contributor

Summary

  • Introduce SoundGenerationModelInterface and SoundGenerationOperationModelInterface contracts for sound generation models
  • Add SOUND_GENERATION capability to CapabilityEnum
  • Wire sound generation into PromptBuilder and AiClient with full test coverage

Why is this needed?

Why a separate SOUND_GENERATION capability instead of reusing SPEECH_GENERATION?

Speech and sound are fundamentally different audio domains with different provider landscapes, parameters, and use cases:

  1. Different output types — Speech models produce spoken language (voice, narration, dialogue). Sound models produce non-verbal audio (sound effects, ambient noise, foley). A consumer asking for "rain on a tin roof" or "sword
    clash" should not be routed to a speech model.
  2. Different provider models — AI providers expose these as distinct model types (e.g. ElevenLabs sound effects vs. their TTS voices, or dedicated SFX models). Collapsing them into one capability would force providers to
    either advertise a capability they don't fully support or implement awkward routing logic internally.
  3. Consistent with the existing pattern — The client already distinguishes between TEXT_TO_SPEECH_CONVERSION (deterministic TTS), SPEECH_GENERATION (generative voice/dialogue), and MUSIC_GENERATION. Sound generation is a
    natural sibling — it completes the audio capability taxonomy without overlap. Merging sound into speech would break the 1:1 mapping between capability enum values and provider model interfaces that the rest of the
    architecture relies on.
  4. Clean capability discovery — Consumers can query CapabilityEnum::soundGeneration() to find providers that actually support sound effect generation, rather than getting speech models that can't fulfill the request.
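
The routing argument in points 1 and 4 can be illustrated with a minimal, self-contained sketch. It deliberately does not use the library's real classes: `Capability`, `SoundGenerationModel`, `SpeechGenerationModel`, `DemoSfxModel`, `DemoVoiceModel`, `findModels()` and the method names are all hypothetical stand-ins, and the actual `CapabilityEnum` and `SoundGenerationModelInterface` in this PR may be shaped differently.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the capability enum; not the library's CapabilityEnum.
final class Capability
{
    public const SPEECH_GENERATION = 'speech_generation';
    public const MUSIC_GENERATION  = 'music_generation';
    public const SOUND_GENERATION  = 'sound_generation';
}

// Assumed contract shapes; the method names are illustrative only.
interface SoundGenerationModel
{
    public function generateSound(string $prompt): string;
}

interface SpeechGenerationModel
{
    public function generateSpeech(string $text): string;
}

// Toy models that return a label instead of real audio data.
final class DemoSfxModel implements SoundGenerationModel
{
    public function generateSound(string $prompt): string
    {
        return 'sfx:' . $prompt;
    }
}

final class DemoVoiceModel implements SpeechGenerationModel
{
    public function generateSpeech(string $text): string
    {
        return 'voice:' . $text;
    }
}

/**
 * Returns the registry entries whose models advertise the given capability,
 * keyed by model ID.
 *
 * @param array<string, array{capabilities: string[], model: object}> $registry
 * @return array<string, object>
 */
function findModels(array $registry, string $capability): array
{
    $matches = [];
    foreach ($registry as $id => $entry) {
        if (in_array($capability, $entry['capabilities'], true)) {
            $matches[$id] = $entry['model'];
        }
    }
    return $matches;
}

$registry = [
    'demo-voice' => [
        'capabilities' => [Capability::SPEECH_GENERATION],
        'model'        => new DemoVoiceModel(),
    ],
    'demo-sfx' => [
        'capabilities' => [Capability::SOUND_GENERATION],
        'model'        => new DemoSfxModel(),
    ],
];

// A request for a sound effect only ever sees models that declared the capability.
$soundModels = findModels($registry, Capability::SOUND_GENERATION);
$clip = $soundModels['demo-sfx']->generateSound('rain on a tin roof');
```

Because each capability maps to exactly one model contract in this sketch, a "rain on a tin roof" request cannot reach `DemoVoiceModel`. If sound were folded into `SPEECH_GENERATION`, both entries would match and the caller would need extra logic to tell them apart.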

Test plan

  • Unit tests added for PromptBuilder sound generation methods
  • AiClient test updated to cover the new capability
  • CapabilityEnum test updated for SOUND_GENERATION

@github-actions

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Co-authored-by: saarnilauri <laurisaarni@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@codecov

codecov bot commented Mar 20, 2026

Codecov Report

❌ Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.17%. Comparing base (6317042) to head (ec0445b).

Files with missing lines       | Patch % | Lines
src/Builders/PromptBuilder.php | 95.23%  | 1 Missing ⚠️

Additional details and impacted files
@@             Coverage Diff              @@
##              trunk     #222      +/-   ##
============================================
+ Coverage     88.12%   88.17%   +0.04%     
- Complexity     1213     1222       +9     
============================================
  Files            60       60              
  Lines          3934     3958      +24     
============================================
+ Hits           3467     3490      +23     
- Misses          467      468       +1     

Flag | Coverage Δ
unit | 88.17% <95.83%> (+0.04%) ⬆️

