feat: Realtime API support reboot #5392

Merged: 4 commits, May 25, 2025

richiejp (Collaborator) commented May 19, 2025

Rebase and continue #3722 with the intention of at least getting audio-to-text with VAD working.

  • feat(realtime): Initial Realtime API implementation
  • chore: go mod tidy

EDIT:

This implements transcription-only mode with VAD enabled, and nothing more. A lot of the code for supporting the full API is still present, but it is not functional; it is there to be built on.

I have only tested against richiejp/VoxInput#2, which works nicely, but the API behavior may not exactly match OpenAI's. It would be helpful for people to test their apps and report the results.

I think it would be good to get transcription only mode out for experimentation and get a version of VoxInput out which uses it. Then I can start thinking of ways to use the full API with VoxInput or something else. The full API could be used with tool calling to enable flexible voice commands, either on desktop or with embedded devices.
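
For anyone who wants to test a client against transcription-only mode, the following is a minimal sketch of the session configuration message a client might send over the WebSocket. It assumes the event name `transcription_session.update` and the field names from OpenAI's Realtime API; whether this implementation accepts exactly this payload is an assumption to verify, and the model name and the `buildTranscriptionSessionUpdate` helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Field names below follow OpenAI's Realtime API documentation.
// Assumption: the server in this PR accepts the same shape.

// turnDetection selects how the server detects the end of speech.
type turnDetection struct {
	Type string `json:"type"` // "server_vad" asks the server to run VAD
}

// inputAudioTranscription names the model used for speech-to-text.
type inputAudioTranscription struct {
	Model string `json:"model"`
}

// transcriptionSession is the session object for transcription-only mode.
type transcriptionSession struct {
	InputAudioFormat        string                  `json:"input_audio_format"`
	InputAudioTranscription inputAudioTranscription `json:"input_audio_transcription"`
	TurnDetection           turnDetection           `json:"turn_detection"`
}

// sessionUpdate is the client event envelope.
type sessionUpdate struct {
	Type    string               `json:"type"`
	Session transcriptionSession `json:"session"`
}

// buildTranscriptionSessionUpdate (hypothetical helper) returns the JSON
// for a transcription-only session with server-side VAD enabled.
func buildTranscriptionSessionUpdate(model string) ([]byte, error) {
	return json.Marshal(sessionUpdate{
		Type: "transcription_session.update",
		Session: transcriptionSession{
			InputAudioFormat:        "pcm16",
			InputAudioTranscription: inputAudioTranscription{Model: model},
			TurnDetection:           turnDetection{Type: "server_vad"},
		},
	})
}

func main() {
	// "my-transcription-model" is a placeholder, not a real model name.
	msg, err := buildTranscriptionSessionUpdate("my-transcription-model")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg))
}
```

The resulting bytes would be sent as a text frame on the Realtime WebSocket connection before streaming audio; the connection handling is left out because it depends on the client library in use.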

netlify bot commented May 19, 2025

Deploy Preview for localai ready!

Name Link
🔨 Latest commit 269a8eb
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/6831aa4ac88e630008005b90
😎 Deploy Preview https://deploy-preview-5392--localai.netlify.app
mudler and others added 4 commits May 24, 2025 12:15
Signed-off-by: Richard Palethorpe <[email protected]>
Reduce the scope of the real time API for the initial release and make
transcription only mode functional.

Signed-off-by: Richard Palethorpe <[email protected]>
@richiejp richiejp marked this pull request as ready for review May 24, 2025 11:15
mudler (Owner) left a comment

Nice! Thank you so much for picking this up!

@mudler mudler merged commit bf6426a into mudler:master May 25, 2025
27 checks passed
mudler (Owner) commented May 26, 2025

Partly related to #3714 and #191

@mudler mudler mentioned this pull request May 27, 2025