feat(ai): Add prefer_in_cloud option for inference mode #9236
Conversation
This change introduces a new InferenceMode option, prefer_in_cloud. When this mode is selected, the SDK will attempt to use the cloud backend first. If the cloud call fails with a network-related error, it will fall back to the on-device model if available.
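The fallback behavior described above can be sketched roughly as follows. This is a hedged illustration, not the SDK's actual implementation: the names `generateWithPreferInCloud`, `callCloud`, `callOnDevice`, and `isNetworkError` are hypothetical, and the real SDK detects network failures through its own error codes.

```typescript
// Hypothetical sketch of the prefer_in_cloud fallback described in this PR.
type Backend = 'cloud' | 'on-device';

interface Result {
  backend: Backend;
  text: string;
}

// Assumption for this sketch: network failures surface as Errors whose
// message mentions "network". The real SDK uses structured error codes.
function isNetworkError(e: unknown): boolean {
  return e instanceof Error && e.message.includes('network');
}

async function generateWithPreferInCloud(
  prompt: string,
  callCloud: (p: string) => Promise<Result>,
  callOnDevice: ((p: string) => Promise<Result>) | undefined
): Promise<Result> {
  try {
    // Try the cloud backend first.
    return await callCloud(prompt);
  } catch (e) {
    // Fall back to on-device only for network-related failures,
    // and only when an on-device model is actually available.
    if (isNetworkError(e) && callOnDevice) {
      return callOnDevice(prompt);
    }
    throw e;
  }
}
```

Non-network errors (for example, a blocked prompt) are rethrown rather than retried on-device, matching the "fails with a network-related error" condition in the description.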
🦋 Changeset detected. Latest commit: 337d34c. The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
This commit adds a new test suite to verify that the GenerativeModel's methods correctly dispatch requests to either the on-device or cloud backends based on the selected InferenceMode. It covers generateContent, generateContentStream, and countTokens.
Co-authored-by: Daniel La Rocque <[email protected]>
…rebase-js-sdk into feat/prefer-in-cloud
docs-devsite/ai.md
@@ -624,13 +624,16 @@ ImagenSafetyFilterLevel: {

<b>(EXPERIMENTAL)</b> Determines whether inference happens on-device or in-cloud.
<b>PREFER\_ON\_DEVICE:</b> Attempt to make inference calls on-device. If on-device inference is not available, it will fall back to cloud. <br/> <b>ONLY\_ON\_DEVICE:</b> Only attempt to make inference calls on-device. It will not fall back to cloud. If on-device inference is not available, inference methods will throw. <br/> <b>ONLY\_IN\_CLOUD:</b> Only attempt to make inference calls to the cloud. It will not fall back to on-device. <br/> <b>PREFER\_IN\_CLOUD:</b> Attempt to make inference calls to the cloud. If not available, it will fall back to on-device.
Suggested change: <b>PREFER\_ON\_DEVICE:</b> Attempt to make inference calls using an on-device model. If on-device inference is not available, the SDK will fall back to using a cloud-hosted model. <br/> <b>ONLY\_ON\_DEVICE:</b> Only attempt to make inference calls using an on-device model. The SDK will not fall back to a cloud-hosted model. If on-device inference is not available, inference methods will throw. <br/> <b>ONLY\_IN\_CLOUD:</b> Only attempt to make inference calls using a cloud-hosted model. The SDK will not fall back to an on-device model. <br/> <b>PREFER\_IN\_CLOUD:</b> Attempt to make inference calls to a cloud-hosted model. If not available, the SDK will fall back to an on-device model.
Addressed in source code comment and re-generated.
Co-authored-by: rachelsaunders <[email protected]>
Add prefer_in_cloud option, the opposite of prefer_on_device.

Note: Used gemini-cli.

Had it add the tests to generative-model.test.ts, covering all four InferenceMode cases, rather than testing generateContent/generateContentStream/countTokens in isolation, because the more of the pipeline a test exercises the better: many errors come from omitting or incorrectly passing params through multiple functions.

Had it extract the on-device vs. cloud logic into a helper function in helpers.ts, even though it is only used in generate-content.ts for now, because we'll need to re-use the same logic for countTokens when we implement that (the Chrome API should be ready now).
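The helper's backend decision might look something like the sketch below. This is a hypothetical illustration only: `chooseInitialBackend` is not the actual helper's name, the string mode values stand in for the SDK's real `InferenceMode` type, and `prefer_in_cloud`'s later network-error fallback is handled elsewhere.

```typescript
// Hypothetical sketch of a backend-selection helper for the four modes.
type InferenceMode =
  | 'prefer_on_device'
  | 'only_on_device'
  | 'only_in_cloud'
  | 'prefer_in_cloud';

function chooseInitialBackend(
  mode: InferenceMode,
  onDeviceAvailable: boolean
): 'cloud' | 'on-device' {
  switch (mode) {
    case 'only_in_cloud':
    case 'prefer_in_cloud':
      // prefer_in_cloud starts with the cloud backend; the fallback to
      // on-device happens later, only if the cloud call hits a network error.
      return 'cloud';
    case 'only_on_device':
      if (!onDeviceAvailable) {
        // No fallback allowed: inference methods throw in this mode.
        throw new Error('On-device inference is not available');
      }
      return 'on-device';
    case 'prefer_on_device':
      return onDeviceAvailable ? 'on-device' : 'cloud';
  }
}
```

Centralizing this in one helper means generateContent, generateContentStream, and a future countTokens all share the same dispatch decision instead of re-implementing it.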
Docs review: only docs-devsite/ai.md needs review.