Copilot AI commented Dec 4, 2025

Adds a UI for adjusting voice synthesis parameters (accent, pitch, speed, intonation) through the VOICEVOX AudioQuery API. Previously, users could only generate audio directly from text, with no control over these parameters.

Architecture

Models

  • AudioQuery, AccentPhrase, Mora - Strongly-typed models matching VOICEVOX API schema with JSON serialization
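As a rough illustration of what such models might look like, here is a minimal sketch using System.Text.Json. The property set and the JSON names (`accent_phrases`, `speedScale`, and so on) are assumptions based on the VOICEVOX engine's REST schema, not a copy of the classes in this PR, and only a subset of the fields is shown.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Build a tiny query, edit one global parameter, and round-trip it through JSON.
var query = new AudioQuery
{
    AccentPhrases =
    {
        new AccentPhrase
        {
            Accent = 1,
            Moras = { new Mora { Text = "コ", Consonant = "k", Vowel = "o", Pitch = 120 } },
        },
    },
};
query.SpeedScale = 1.2;

string json = JsonSerializer.Serialize(query);
AudioQuery copy = JsonSerializer.Deserialize<AudioQuery>(json)!;
Console.WriteLine(json);
Console.WriteLine(copy.AccentPhrases[0].Moras[0].Pitch);

// Field names below are assumed from the VOICEVOX REST schema.
public sealed class Mora
{
    [JsonPropertyName("text")] public string Text { get; set; } = "";
    [JsonPropertyName("consonant")] public string? Consonant { get; set; }
    [JsonPropertyName("consonant_length")] public double? ConsonantLength { get; set; }
    [JsonPropertyName("vowel")] public string Vowel { get; set; } = "";
    [JsonPropertyName("vowel_length")] public double VowelLength { get; set; }
    [JsonPropertyName("pitch")] public double Pitch { get; set; }
}

public sealed class AccentPhrase
{
    [JsonPropertyName("moras")] public List<Mora> Moras { get; set; } = new();
    [JsonPropertyName("accent")] public int Accent { get; set; }
    [JsonPropertyName("pause_mora")] public Mora? PauseMora { get; set; }
    [JsonPropertyName("is_interrogative")] public bool IsInterrogative { get; set; }
}

public sealed class AudioQuery
{
    [JsonPropertyName("accent_phrases")] public List<AccentPhrase> AccentPhrases { get; set; } = new();
    [JsonPropertyName("speedScale")] public double SpeedScale { get; set; } = 1.0;
    [JsonPropertyName("pitchScale")] public double PitchScale { get; set; }
    [JsonPropertyName("intonationScale")] public double IntonationScale { get; set; } = 1.0;
    [JsonPropertyName("volumeScale")] public double VolumeScale { get; set; } = 1.0;
    [JsonPropertyName("prePhonemeLength")] public double PrePhonemeLength { get; set; }
    [JsonPropertyName("postPhonemeLength")] public double PostPhonemeLength { get; set; }
}
```

Explicit `JsonPropertyName` attributes keep the C# property names idiomatic while matching a schema that mixes snake_case and camelCase keys.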

ViewModels

  • AccentPhraseViewModel, MoraViewModel - Reactive wrappers with IDisposable for subscription management
  • Enhanced TtsTabViewModel with AudioQuery generation/synthesis methods

UI

  • Global controls: speed (0.5 to 2.0), pitch (-0.15 to 0.15), intonation (0 to 2.0), volume, silence padding
  • Per-phrase: accent position, interrogative flag
  • Per-mora: drag-based pitch curve editor (0-200Hz) with one draggable point per mora

Drag-Based Pitch Editor

The pitch curve editor (PitchCurveEditor.cs) provides an intuitive visual interface:

  • Visual pitch curve showing all moras in context with connecting lines
  • Draggable points - Click and drag any point vertically to adjust pitch
  • Real-time feedback - Curve updates instantly as you drag
  • Grid reference lines for easy alignment (0-200Hz range)
  • Mora labels and pitch values displayed for each point

Users adjust pitch by dragging a point up (higher pitch) or down (lower pitch), which is more direct than typing a numeric value for each mora when shaping an intonation pattern.
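The core of a drag-based editor like this is the mapping between a vertical pixel position and a pitch value. The following is a minimal sketch of that mapping, assuming a linear 0-200Hz axis with the top of the control as the maximum; `PitchAxis` and its members are hypothetical names and are not taken from PitchCurveEditor.cs.

```csharp
using System;

// Dragging a point to the vertical middle of a 200-pixel-high editor.
Console.WriteLine(PitchAxis.YToPitch(y: 100, height: 200)); // prints 100
// Dragging above the top edge is clamped to the maximum pitch.
Console.WriteLine(PitchAxis.YToPitch(y: -30, height: 200)); // prints 200

// Hypothetical helper: converts between pitch values and screen Y coordinates.
public static class PitchAxis
{
    public const double MinPitch = 0.0;
    public const double MaxPitch = 200.0;

    // Screen Y grows downward, so higher pitch maps to smaller Y.
    public static double PitchToY(double pitch, double height)
    {
        double clamped = Math.Clamp(pitch, MinPitch, MaxPitch);
        double ratio = (clamped - MinPitch) / (MaxPitch - MinPitch);
        return height * (1.0 - ratio);
    }

    // Inverse mapping used while dragging; out-of-bounds Y is clamped to the axis range.
    public static double YToPitch(double y, double height)
    {
        if (height <= 0) throw new ArgumentOutOfRangeException(nameof(height));
        double ratio = 1.0 - Math.Clamp(y / height, 0.0, 1.0);
        return MinPitch + ratio * (MaxPitch - MinPitch);
    }
}
```

Clamping in both directions means a pointer that leaves the control during a drag cannot push a mora outside the displayed range.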

Workflow

// Generate an AudioQuery from the input text.
public Task GenerateAudioQuery()
{
    var result = synthesizer.CreateAudioQuery(text, styleId, options, out var json);
    // Keep the parsed query so that synthesis can pick up the user's edits.
    CurrentAudioQuery.Value = JsonSerializer.Deserialize<AudioQuery>(json);
    // Populate the UI with the editable parameters.
    return Task.CompletedTask;
}

// Synthesize with the modified AudioQuery when one exists.
private Task<byte[]?> Tts()
{
    byte[]? wav;
    if (CurrentAudioQuery.Value != null)
    {
        var json = JsonSerializer.Serialize(CurrentAudioQuery.Value);
        synthesizer.Synthesis(json, styleId, options, out wav);
    }
    else
    {
        // Fall back to direct TTS from the raw text.
        synthesizer.Tts(text, styleId, options, out wav);
    }
    return Task.FromResult(wav);
}

Resource Management

ViewModels implement IDisposable with subscription tracking to prevent memory leaks:

private readonly List<IDisposable> _disposables = new();

// In the constructor: track each subscription so Dispose() can release it.
_disposables.Add(Pitch.Subscribe(value => Model.Pitch = value));
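To make the pattern concrete, here is a self-contained sketch of a view model that tracks its subscriptions and releases them in `Dispose()`. `ObservableValue<T>` is a hypothetical stand-in for whatever reactive property type the project uses, and `MoraModel` and `MoraViewModelSketch` are illustrative names, not the classes from this PR.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

var model = new MoraModel { Pitch = 110 };
var vm = new MoraViewModelSketch(model);
vm.Pitch.Value = 130;            // propagates to the model
Console.WriteLine(model.Pitch);  // prints 130
vm.Dispose();
vm.Pitch.Value = 90;             // subscription released, model is unchanged
Console.WriteLine(model.Pitch);  // prints 130

public sealed class MoraModel { public double Pitch { get; set; } }

// Hypothetical minimal reactive property: notifies subscribers on every set.
public sealed class ObservableValue<T>
{
    private readonly List<Action<T>> _handlers = new();
    private T _value;
    public ObservableValue(T initial) => _value = initial;

    public T Value
    {
        get => _value;
        set
        {
            _value = value;
            // Copy the list so handlers may unsubscribe during notification.
            foreach (var handler in _handlers.ToArray()) handler(value);
        }
    }

    public IDisposable Subscribe(Action<T> handler)
    {
        _handlers.Add(handler);
        return new Unsubscriber(() => _handlers.Remove(handler));
    }

    private sealed class Unsubscriber : IDisposable
    {
        private Action? _onDispose;
        public Unsubscriber(Action onDispose) => _onDispose = onDispose;
        public void Dispose() { _onDispose?.Invoke(); _onDispose = null; }
    }
}

public sealed class MoraViewModelSketch : IDisposable
{
    private readonly List<IDisposable> _disposables = new();
    public MoraModel Model { get; }
    public ObservableValue<double> Pitch { get; }

    public MoraViewModelSketch(MoraModel model)
    {
        Model = model;
        Pitch = new ObservableValue<double>(model.Pitch);
        // Track the subscription so Dispose() can release it.
        _disposables.Add(Pitch.Subscribe(value => Model.Pitch = value));
    }

    public void Dispose()
    {
        foreach (var d in _disposables) d.Dispose();
        _disposables.Clear();
    }
}
```

Disposing the view model when its accent phrase is removed from the UI keeps stale handlers from writing to models that are no longer displayed.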

Backward compatible: if no AudioQuery has been generated, synthesis falls back to direct TTS from the text.

Original prompt

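Please implement a UI for adjusting the accent and other parameters of the generated audio.
The flow is as follows:

  1. The user enters text
  2. Generate an AudioQuery
  3. Parse the AudioQuery
  4. Display it in the UI
  5. The user adjusts parameters such as accent and pitch
  6. Apply the changes to the AudioQuery
  7. Generate the audio

Only a simple C# binding is available for the API, so please refer to the REST API documentation below for the schemas of AudioQuery and related types.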



Copilot AI changed the title from "[WIP] Implement UI for adjusting audio query parameters" to "Implement AudioQuery UI for voice parameter adjustment" Dec 4, 2025
Copilot AI requested a review from yuto-trd December 4, 2025 04:49
Member

@yuto-trd yuto-trd left a comment

Please make it editable by dragging with the mouse.

Author

Copilot AI commented Dec 4, 2025

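Please make it editable by dragging with the mouse.

Pitch can now be edited by dragging. Commit: e1bb35d

New features:

  • Visual pitch curve editor
  • Drag a ● point up or down to change its pitch (0-200Hz)
  • The curve updates in real time
  • Grid lines and labels for readability

Usage:

  1. Click a ● point
  2. Drag up → the pitch rises
  3. Drag down → the pitch falls
  4. Release the mouse button to commit the change

Pitch curve (drag to edit):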
200Hz ┌─────────────────────────┐
      │        ●130             │
150Hz │  ●120  │   ●100         │
      │ /  \   │  /   \         │
100Hz │/    \ │ /     \        │
      ●110   ●95●140   ●90     │
  0Hz └─────────────────────────┘
      こ ん に ち は せ か

Details: PITCH_CURVE_EDITOR.md

Copilot AI changed the title from "Implement AudioQuery UI for voice parameter adjustment" to "Implement AudioQuery UI for voice parameter adjustment with drag-based pitch editor" Dec 4, 2025
Copilot AI requested a review from yuto-trd December 4, 2025 05:33