Merged
26 commits
687840d
feat(audio): extend TTSProviderId/ASRProviderId to support custom pro…
wyuc Apr 11, 2026
2626785
feat(audio): update provider registry helpers for custom provider sup…
wyuc Apr 11, 2026
3c0b4e7
feat(audio): route custom providers to OpenAI-compatible implementations
wyuc Apr 11, 2026
46a0cf4
feat(audio): add custom TTS/ASR provider CRUD to settings store
wyuc Apr 11, 2026
91e94b4
feat(audio): add dialog for creating custom audio providers
wyuc Apr 11, 2026
99caa0e
feat(audio): add custom provider UI with voice management and delete
wyuc Apr 11, 2026
70db8ff
feat(i18n): add custom audio provider translation strings
wyuc Apr 11, 2026
aab12f8
fix(audio): resolve remaining type errors for custom provider support
wyuc Apr 11, 2026
fd53ac2
fix(audio): allow custom provider IDs to reach switch default branch
wyuc Apr 11, 2026
35f4145
feat(audio): add default model field to dialog and polish voice list UI
wyuc Apr 11, 2026
9ef827c
fix(audio): fallback to customDefaultBaseUrl when testing custom TTS …
wyuc Apr 11, 2026
e81525c
fix(audio): auto-select first voice when adding to custom TTS provider
wyuc Apr 11, 2026
93b1759
fix(audio): fallback to customDefaultBaseUrl in all TTS call sites
wyuc Apr 11, 2026
a770303
fix(audio): include custom provider voices in agent voice picker
wyuc Apr 11, 2026
9f87c2b
fix(audio): fallback to customDefaultBaseUrl in all ASR call sites
wyuc Apr 11, 2026
6edb356
feat(audio): add model list management for custom ASR providers
wyuc Apr 12, 2026
a3f3b67
fix(audio): seed customModels from defaultModel and fix ASR model label
wyuc Apr 12, 2026
7224c3e
fix(audio): separate ASR model management from creation dialog
wyuc Apr 12, 2026
db8739b
fix(audio): simplify ASR model management — first model is default
wyuc Apr 12, 2026
773e81a
fix(audio): add loading state and detailed errors to ASR test
wyuc Apr 12, 2026
cb5e5a0
fix(i18n): use correct key for processing label in ASR test button
wyuc Apr 12, 2026
53b5dba
fix(audio): polish custom provider UX and type safety
wyuc Apr 12, 2026
2834e3a
fix(test): add isCustomTTSProvider/isCustomASRProvider to audio types…
wyuc Apr 12, 2026
30b35bc
fix(audio): add customDefaultBaseUrl fallback in all TTS/ASR call sites
wyuc Apr 12, 2026
773a180
style: format generation-preview/page.tsx
wyuc Apr 12, 2026
403952f
fix: remove unused ttsSpeedRange variable in media-popover
wyuc Apr 12, 2026
5 changes: 4 additions & 1 deletion app/generation-preview/page.tsx
Original file line number Diff line number Diff line change
@@ -728,7 +728,10 @@ function GenerationPreviewContent() {
  ttsVoice: settings.ttsVoice,
  ttsSpeed: settings.ttsSpeed,
  ttsApiKey: ttsProviderConfig?.apiKey || undefined,
- ttsBaseUrl: ttsProviderConfig?.baseUrl || undefined,
+ ttsBaseUrl:
+ ttsProviderConfig?.baseUrl ||
+ ttsProviderConfig?.customDefaultBaseUrl ||
+ undefined,
  }),
  signal,
  });
10 changes: 8 additions & 2 deletions components/agent/agent-bar.tsx
@@ -113,7 +113,10 @@ function AgentVoicePill({
  ttsVoice: voiceId,
  ttsSpeed: 1,
  ttsApiKey: providerConfig?.apiKey,
- ttsBaseUrl: providerConfig?.serverBaseUrl || providerConfig?.baseUrl,
+ ttsBaseUrl:
+ providerConfig?.serverBaseUrl ||
+ providerConfig?.baseUrl ||
+ providerConfig?.customDefaultBaseUrl,
  }),
  signal: controller.signal,
  });
@@ -337,7 +340,10 @@ function TeacherVoicePill({
  ttsVoice: voiceId,
  ttsSpeed: 1,
  ttsApiKey: providerConfig?.apiKey,
- ttsBaseUrl: providerConfig?.serverBaseUrl || providerConfig?.baseUrl,
+ ttsBaseUrl:
+ providerConfig?.serverBaseUrl ||
+ providerConfig?.baseUrl ||
+ providerConfig?.customDefaultBaseUrl,
  }),
  signal: controller.signal,
  });
2 changes: 1 addition & 1 deletion components/audio/tts-config-popover.tsx
@@ -68,7 +68,7 @@ export function TtsConfigPopover() {
  voice: ttsVoice,
  speed: ttsSpeed,
  apiKey: providerConfig?.apiKey,
- baseUrl: providerConfig?.baseUrl,
+ baseUrl: providerConfig?.baseUrl || providerConfig?.customDefaultBaseUrl,
  });
  } catch (error) {
  const message =
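
Reviewer note: the same base-URL fallback now appears at every TTS call site touched so far, and only `agent-bar.tsx` consults `serverBaseUrl` first. One way to keep the call sites from drifting apart would be a small shared helper. The sketch below is an assumption, not code from this PR: `BaseUrlSources` and `resolveBaseUrl` are hypothetical names, and the `useServerUrl` flag exists only to reproduce the difference between the call sites.

```typescript
// Hypothetical subset of a provider config: only the URL fields used in this diff.
interface BaseUrlSources {
  serverBaseUrl?: string;
  baseUrl?: string;
  customDefaultBaseUrl?: string;
}

// Returns the first non-empty URL in priority order, or undefined if none is set.
// `||` is used rather than `??` so that an empty string falls through,
// which matches the behavior of the call sites in the diff.
function resolveBaseUrl(
  config: BaseUrlSources | undefined,
  useServerUrl = false,
): string | undefined {
  const server = useServerUrl ? config?.serverBaseUrl : undefined;
  return server || config?.baseUrl || config?.customDefaultBaseUrl || undefined;
}
```

Under these assumptions, a call site such as `TtsConfigPopover` could pass `baseUrl: resolveBaseUrl(providerConfig)`, while the voice pills in `agent-bar.tsx` would pass `resolveBaseUrl(providerConfig, true)`.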
57 changes: 37 additions & 20 deletions components/generation/media-popover.tsx
@@ -9,8 +9,8 @@
Mic,
SlidersHorizontal,
ChevronRight,
Play,

Check warning on line 12 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'Play' is defined but never used. Allowed unused vars must match /^_/u
Loader2,

Check warning on line 13 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'Loader2' is defined but never used. Allowed unused vars must match /^_/u
} from 'lucide-react';
import { toast } from 'sonner';
import { Popover, PopoverContent, PopoverTrigger } from '@/components/ui/popover';
@@ -24,7 +24,7 @@
SelectTrigger,
SelectValue,
} from '@/components/ui/select';
import { Slider } from '@/components/ui/slider';

Check warning on line 27 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'Slider' is defined but never used. Allowed unused vars must match /^_/u
import { Switch } from '@/components/ui/switch';
import { cn } from '@/lib/utils';
import { useI18n } from '@/lib/hooks/use-i18n';
@@ -32,10 +32,11 @@
  import { useTTSPreview } from '@/lib/audio/use-tts-preview';
  import { IMAGE_PROVIDERS } from '@/lib/media/image-providers';
  import { VIDEO_PROVIDERS } from '@/lib/media/video-providers';
- import { TTS_PROVIDERS, getTTSVoices } from '@/lib/audio/constants';
+ import { TTS_PROVIDERS, getTTSVoices, CUSTOM_ASR_DEFAULT_LANGUAGES } from '@/lib/audio/constants';
  import { ASR_PROVIDERS, getASRSupportedLanguages } from '@/lib/audio/constants';
  import type { ImageProviderId, VideoProviderId } from '@/lib/media/types';
  import type { TTSProviderId, ASRProviderId } from '@/lib/audio/types';
+ import { isCustomASRProvider } from '@/lib/audio/types';
  import type { SettingsSection } from '@/lib/types/settings';

  interface MediaPopoverProps {
@@ -137,9 +138,9 @@
const ttsVoice = useSettingsStore((s) => s.ttsVoice);
const ttsSpeed = useSettingsStore((s) => s.ttsSpeed);
const ttsProvidersConfig = useSettingsStore((s) => s.ttsProvidersConfig);
const setTTSProvider = useSettingsStore((s) => s.setTTSProvider);

Check warning on line 141 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'setTTSProvider' is assigned a value but never used. Allowed unused vars must match /^_/u
const setTTSVoice = useSettingsStore((s) => s.setTTSVoice);

Check warning on line 142 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'setTTSVoice' is assigned a value but never used. Allowed unused vars must match /^_/u
const setTTSSpeed = useSettingsStore((s) => s.setTTSSpeed);

Check warning on line 143 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'setTTSSpeed' is assigned a value but never used. Allowed unused vars must match /^_/u

const asrProviderId = useSettingsStore((s) => s.asrProviderId);
const asrLanguage = useSettingsStore((s) => s.asrLanguage);
@@ -167,8 +168,6 @@
  needsKey: boolean,
  ) => !needsKey || !!configs[id]?.apiKey || !!configs[id]?.isServerConfigured;

- const ttsSpeedRange = TTS_PROVIDERS[ttsProviderId]?.speedRange;
-
  // ─── Dynamic browser voices ───
  const [browserVoices, setBrowserVoices] = useState<SpeechSynthesisVoice[]>([]);
  useEffect(() => {
@@ -216,7 +215,7 @@

// TTS: grouped by provider, voices as items (matching Image/Video pattern)
// Browser-native voices are split into sub-groups by language.
const ttsGroups = useMemo(() => {

Check warning on line 218 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'ttsGroups' is assigned a value but never used. Allowed unused vars must match /^_/u
const groups: SelectGroupData[] = [];

for (const p of Object.values(TTS_PROVIDERS)) {
@@ -261,7 +260,7 @@
}, [ttsProvidersConfig, locale, browserVoices, t]);

// TTS preview
const handlePreview = useCallback(async () => {

Check warning on line 263 in components/generation/media-popover.tsx (GitHub Actions / Lint, Typecheck & Unit Tests): 'handlePreview' is assigned a value but never used. Allowed unused vars must match /^_/u
if (previewing) {
stopPreview();
return;
@@ -275,7 +274,7 @@
  voice: ttsVoice,
  speed: ttsSpeed,
  apiKey: providerConfig?.apiKey,
- baseUrl: providerConfig?.baseUrl,
+ baseUrl: providerConfig?.baseUrl || providerConfig?.customDefaultBaseUrl,
  });
  } catch (error) {
  const message =
@@ -293,23 +292,41 @@
  ttsVoice,
  ]);

- // ASR: only available providers
- const asrGroups = useMemo(
- () =>
- Object.values(ASR_PROVIDERS)
- .filter((p) => cfgOk(asrProvidersConfig, p.id, p.requiresApiKey))
- .map((p) => ({
- groupId: p.id,
- groupName: p.name,
- groupIcon: p.icon,
- available: true,
- items: getASRSupportedLanguages(p.id).map((l) => ({
- id: l,
- name: l,
- })),
+ // ASR: built-in + custom providers
+ const asrGroups = useMemo(() => {
+ const groups: SelectGroupData[] = [];
+
+ // Built-in providers
+ for (const p of Object.values(ASR_PROVIDERS)) {
+ if (!cfgOk(asrProvidersConfig, p.id, p.requiresApiKey)) continue;
+ groups.push({
+ groupId: p.id,
+ groupName: p.name,
+ groupIcon: p.icon,
+ available: true,
+ items: getASRSupportedLanguages(p.id).map((l) => ({
+ id: l,
+ name: l,
  })),
- [asrProvidersConfig],
- );
+ });
+ }
+
+ // Custom providers — only show if at least one model is configured
+ for (const [id, cfg] of Object.entries(asrProvidersConfig)) {
+ if (!isCustomASRProvider(id)) continue;
+ const customModels = cfg.customModels || [];
+ if (customModels.length === 0) continue;
+ const providerName = cfg.customName || id;
+ groups.push({
+ groupId: id,
+ groupName: providerName,
+ available: true,
+ items: CUSTOM_ASR_DEFAULT_LANGUAGES.map((l) => ({ id: l, name: l })),
+ });
+ }
+
+ return groups;
+ }, [asrProvidersConfig]);

  // Auto-select first enabled tab on open
  const handleOpenChange = (isOpen: boolean) => {
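
Reviewer note: the custom-provider branch of `asrGroups` is easier to check in isolation as a pure function. The sketch below is a standalone approximation under stated assumptions. The diff does not show how `isCustomASRProvider` decides, nor the contents of `CUSTOM_ASR_DEFAULT_LANGUAGES` or the element type of `customModels`, so the `custom-asr-` prefix, the language list, and the `unknown` element type are all placeholders. `buildCustomAsrGroups` is a hypothetical name.

```typescript
interface SelectItemData {
  id: string;
  name: string;
}

interface SelectGroupData {
  groupId: string;
  groupName: string;
  available: boolean;
  items: SelectItemData[];
}

// Only the fields read by the custom-provider loop in the diff.
interface CustomAsrConfig {
  customName?: string;
  customModels?: ReadonlyArray<unknown>; // ASSUMPTION: element type is not shown in the diff
}

// ASSUMPTION: the real check lives in '@/lib/audio/types'; this prefix is a placeholder.
const CUSTOM_ASR_PREFIX = 'custom-asr-';
const isCustomASRProvider = (id: string): boolean => id.startsWith(CUSTOM_ASR_PREFIX);

// ASSUMPTION: placeholder values; the real list lives in '@/lib/audio/constants'.
const CUSTOM_ASR_DEFAULT_LANGUAGES: readonly string[] = ['auto', 'en', 'zh'];

// Builds one select group per custom ASR provider that has at least one model,
// falling back to the provider ID when no display name was given.
function buildCustomAsrGroups(configs: Record<string, CustomAsrConfig>): SelectGroupData[] {
  const groups: SelectGroupData[] = [];
  for (const [id, cfg] of Object.entries(configs)) {
    if (!isCustomASRProvider(id)) continue;
    if ((cfg.customModels ?? []).length === 0) continue;
    groups.push({
      groupId: id,
      groupName: cfg.customName || id,
      available: true,
      items: CUSTOM_ASR_DEFAULT_LANGUAGES.map((l) => ({ id: l, name: l })),
    });
  }
  return groups;
}
```

Note that, as in the diff, custom providers are not passed through `cfgOk`, so a custom provider that requires an API key but has none configured would still be listed as available.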
141 changes: 141 additions & 0 deletions components/settings/add-audio-provider-dialog.tsx
@@ -0,0 +1,141 @@
'use client';

import { useState } from 'react';
import { Dialog, DialogContent, DialogTitle, DialogDescription } from '@/components/ui/dialog';
import { Button } from '@/components/ui/button';
import { Input } from '@/components/ui/input';
import { Label } from '@/components/ui/label';
import { Checkbox } from '@/components/ui/checkbox';
import { Plus } from 'lucide-react';
import { useI18n } from '@/lib/hooks/use-i18n';

export interface NewAudioProviderData {
name: string;
baseUrl: string;
defaultModel: string;
requiresApiKey: boolean;
}

interface AddAudioProviderDialogProps {
open: boolean;
onOpenChange: (open: boolean) => void;
onAdd: (data: NewAudioProviderData) => void;
type: 'tts' | 'asr';
}

export function AddAudioProviderDialog({
open,
onOpenChange,
onAdd,
type,
}: AddAudioProviderDialogProps) {
const { t } = useI18n();

const [name, setName] = useState('');
const [baseUrl, setBaseUrl] = useState('');
const [defaultModel, setDefaultModel] = useState('');
const [requiresApiKey, setRequiresApiKey] = useState(false);

// Reset form when dialog closes
const [prevOpen, setPrevOpen] = useState(open);
if (open !== prevOpen) {
setPrevOpen(open);
if (!open) {
setName('');
setBaseUrl('');
setDefaultModel('');
setRequiresApiKey(false);
}
}

const handleAdd = () => {
if (!name.trim() || !baseUrl.trim()) return;
onAdd({
name: name.trim(),
baseUrl: baseUrl.trim(),
defaultModel: defaultModel.trim(),
requiresApiKey,
});
onOpenChange(false);
};

const titleKey =
type === 'tts' ? 'settings.addCustomTTSProvider' : 'settings.addCustomASRProvider';

return (
<Dialog open={open} onOpenChange={onOpenChange}>
<DialogContent className="sm:max-w-[450px]">
<DialogTitle className="sr-only">{t(titleKey)}</DialogTitle>
<DialogDescription className="sr-only">
{t('settings.addCustomAudioProviderDescription')}
</DialogDescription>
<div className="space-y-4">
<div className="pb-3 border-b">
<h2 className="text-lg font-semibold">{t(titleKey)}</h2>
<p className="text-xs text-muted-foreground mt-1">
{t('settings.addCustomAudioProviderDescription')}
</p>
</div>

<div className="space-y-2">
<Label>{t('settings.providerName')}</Label>
<Input
placeholder={type === 'tts' ? 'My Local TTS' : 'My Local ASR'}
value={name}
onChange={(e) => setName(e.target.value)}
/>
</div>

<div className="space-y-2">
<Label>{t('settings.defaultBaseUrl')}</Label>
<Input
type="url"
placeholder="http://localhost:8000/v1"
value={baseUrl}
onChange={(e) => setBaseUrl(e.target.value)}
/>
</div>

{/* Default Model — TTS only (ASR models are managed in provider settings) */}
{type === 'tts' && (
<div className="space-y-2">
<Label>{t('settings.defaultModel')}</Label>
<Input
placeholder="tts-1"
value={defaultModel}
onChange={(e) => setDefaultModel(e.target.value)}
/>
<p className="text-xs text-muted-foreground">{t('settings.defaultModelHint')}</p>
</div>
)}

<div className="flex items-center space-x-2">
<Checkbox
id="audio-requires-api-key"
checked={requiresApiKey}
onCheckedChange={(checked) => setRequiresApiKey(checked as boolean)}
/>
<label htmlFor="audio-requires-api-key" className="text-sm cursor-pointer">
{t('settings.requiresApiKey')}
</label>
</div>

<div className="flex items-center justify-end gap-2 pt-3 border-t">
<Button variant="outline" size="sm" onClick={() => onOpenChange(false)}>
{t('settings.cancelEdit')}
</Button>
<Button
size="sm"
onClick={handleAdd}
disabled={!name.trim() || !baseUrl.trim()}
className="gap-1.5"
>
<Plus className="h-3.5 w-3.5" />
{t('settings.addProviderButton')}
</Button>
</div>
</div>
</DialogContent>
</Dialog>
);
}
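
Reviewer note: `handleAdd` and the submit button's `disabled` prop both repeat the `!name.trim() || !baseUrl.trim()` check. A minimal sketch of pulling the trim-and-validate step into one pure function follows. `toNewProviderData` and `AudioProviderFormState` are hypothetical names that do not appear in the PR, and the interface is copied from the dialog only so that the snippet is self-contained.

```typescript
// Mirrors the exported interface in add-audio-provider-dialog.tsx.
interface NewAudioProviderData {
  name: string;
  baseUrl: string;
  defaultModel: string;
  requiresApiKey: boolean;
}

// Raw, untrimmed values as held in the dialog's useState hooks.
type AudioProviderFormState = NewAudioProviderData;

// Trims the text fields and returns null when a required field is empty,
// so the caller can use the result both for submission and for `disabled`.
function toNewProviderData(form: AudioProviderFormState): NewAudioProviderData | null {
  const name = form.name.trim();
  const baseUrl = form.baseUrl.trim();
  if (!name || !baseUrl) return null;
  return {
    name,
    baseUrl,
    defaultModel: form.defaultModel.trim(),
    requiresApiKey: form.requiresApiKey,
  };
}
```

With this shape, the button could use `disabled={toNewProviderData(form) === null}` and `handleAdd` could return early on `null`. Like the dialog itself, the sketch only checks that the URL is non-empty; it does not verify that it parses as a URL.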