Voice Stream AI: Audio Processing Improvements #39
Hey @MsMatias, thanks for this PR! I'm going through it and testing. Let me know if you can go through it, otherwise I can try to find some time next week.
Hi @alesaccoia. Thanks for the feedback. I will take a look and update the tests as well.
@alesaccoia Please let me know if you think we should add something else.
thanks, will test this during the weekend
…On Fri, 18 Apr 2025 at 11:28, Matias Samuel Miranda < ***@***.***> wrote:
@alesaccoia <https://github.com/alesaccoia> Please, let me know if you
think we should add something else.
This PR addresses several performance bottlenecks and architectural limitations in the current implementation:
Summary
This PR implements several significant improvements to the VoiceStreamAI speech recognition system:
Key Technical Changes
Audio Processing
convert_audio_bytes_to_numpy() for efficient conversion
Voice Activity Detection
Architecture Improvements
AudioProcessingCallbacks class for event-driven architecture
Configuration
Testing
These changes have been tested with both VAD types and verify that:
Performance Improvements
This PR addresses several issues with the original implementation and should improve overall performance and reliability of the system.
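For readers without the diff open, here is a minimal sketch of how the two names mentioned under Key Technical Changes might fit together. It is not the PR's code. It assumes the incoming audio is 16-bit little-endian PCM, and the `register`/`emit` methods and the `"chunk"` event name are hypothetical; only `convert_audio_bytes_to_numpy` and `AudioProcessingCallbacks` come from the description above.

```python
from typing import Callable, Dict, List

import numpy as np


def convert_audio_bytes_to_numpy(audio_bytes: bytes) -> np.ndarray:
    """Convert raw PCM bytes to float32 samples in [-1.0, 1.0).

    Assumption: 16-bit little-endian PCM. The real helper in the PR may
    expect a different sample format.
    """
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    usable = len(audio_bytes) - (len(audio_bytes) % 2)
    samples = np.frombuffer(audio_bytes[:usable], dtype="<i2")
    # frombuffer makes no copy of the bytes; astype allocates the float array.
    return samples.astype(np.float32) / 32768.0


class AudioProcessingCallbacks:
    """Hypothetical event registry; method and event names are illustrative."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[np.ndarray], None]]] = {}

    def register(self, event: str, handler: Callable[[np.ndarray], None]) -> None:
        # Several handlers may subscribe to the same event.
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event: str, samples: np.ndarray) -> None:
        # Events with no subscribers are ignored.
        for handler in self._handlers.get(event, []):
            handler(samples)


callbacks = AudioProcessingCallbacks()
received: List[np.ndarray] = []
callbacks.register("chunk", received.append)

raw = np.array([0, 16384, -32768], dtype="<i2").tobytes()
callbacks.emit("chunk", convert_audio_bytes_to_numpy(raw))
```

In this sketch the conversion runs once per chunk, and every subscriber receives the same float array, so consumers such as a VAD or a transcriber would not each need to decode the bytes again.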