Real-Time Speech-to-Text Translation Support #58

Open
hu-ke opened this issue Jan 24, 2025 · 0 comments

Description of the feature request:

Instead of waiting for a speech turn to complete (as in VAD mode), would it be possible to stream the generated results in real time while the speaker is still talking?
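To make the request concrete, here is a rough sketch of the two behaviours. Everything in it is hypothetical: `translate_chunk`, `turn_based`, `streaming`, and the `"<end>"` marker are illustrative stand-ins, not part of the Gemini API, and the "translation" is just upper-casing so the example runs on its own.

```python
from typing import Iterable, Iterator, List

END_OF_TURN = "<end>"  # hypothetical marker for "VAD detected the end of a turn"


def translate_chunk(chunk: str) -> str:
    """Placeholder for speech-to-text translation (NOT a real API call)."""
    return chunk.upper()


def turn_based(chunks: Iterable[str]) -> Iterator[str]:
    """Buffer chunks and emit one result only when the turn ends."""
    buffer: List[str] = []
    for chunk in chunks:
        if chunk == END_OF_TURN:
            yield " ".join(translate_chunk(c) for c in buffer)
            buffer.clear()
        else:
            buffer.append(chunk)


def streaming(chunks: Iterable[str]) -> Iterator[str]:
    """Emit a partial result for every chunk as soon as it arrives."""
    for chunk in chunks:
        if chunk != END_OF_TURN:
            yield translate_chunk(chunk)


audio = ["hello", "nice to", "meet you", END_OF_TURN]
turn_result = list(turn_based(audio))
stream_result = list(streaming(audio))
print(turn_result)    # one result, available only after the turn ends
print(stream_result)  # one partial result per chunk
```

The second function is the behaviour being asked for: partial output arrives per chunk instead of once per completed turn.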

What problem are you trying to solve with this feature?

Suppose I am in an interview conducted in Japanese, but my Japanese is not very strong. I would like to build an app with the Gemini Multimodal API that assists me with real-time speech-to-text translation.

Any other information you'd like to share?

No response
