-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fallback API for Inference #4099
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
chenghao-mou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Not sure if this is discussed before: there is a side effect on the logs/metrics part—if a fallback is chosen inside gateway, we will still report metrics/logs under _opts.model.
|
Yeah, that's interesting i wonder if we should find a way to report back the model used by Inference. Also, @chenghao-mou can you coordinate with me before merging and releasing? |
|
Also, I just pushed a small change to address 3.9 type checks, hopefully. |
Co-authored-by: Théo Monnom <[email protected]>
d5f1c8f to
adb0c19
Compare
theomonnom
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise lgtm! Nice work!
| name: Required[str] | ||
| """Model name (e.g. "cartesia/sonic", "elevenlabs/eleven_flash_v2", "rime/arcana").""" | ||
|
|
||
| voice: Required[str | None] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the voice be None? Is this different than omitting it (removing the Required flag)
| class Fallback(TypedDict, total=False): | ||
| """Configuration for fallback models when the primary model fails.""" | ||
|
|
||
| models: Required[Sequence[FallbackModelType]] | ||
| """Fallback models in priority order.""" | ||
|
|
||
| connection: ConnectionOptions | ||
| """Connection options for fallback attempts.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this class is really useful.
Maybe the argument inside the TTS/STT can directly accept a Sequence of FallbackModelType and the ConnectionOptions can directly be inside the constructor?
Probably nitpicking but maybe we can also re-use
| class APIConnectOptions: |
I'm not 100% sure, wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, we already have conn_options as a parameter in stream or _recognize_impl, but it is often used with the default value somewhere behind the scenes (not exposed to the user). I think it is a good idea to expose this at the constructor level for all of them (we can start with inference in this PR) with the existing APIConnectOptions.
That would be nice-to-have, and we can do that in a later PR. And yes, I should be able to do that. |
No description provided.