Can't Fit a Binary Classifier that Uses Gemma Pre-trained Model Embeddings #2102
Comments
The reason I think I've isolated a problem in the Gemma encoding layer is that training the classifier model works fine if I swap a …
I think the problem is related to instantiating a sub-model (…)
This unblocks training locally for me, but I'm seeing a TPU error when trying the fix in your Colab; I'm unsure if it's related.
Thanks, @jeffcarp. The change did get past the original error. Maybe it's the TPU error to which you're referring?
I'm able to get training running on GPU, but the Colab instance doesn't have enough memory to load the full Gemma preset: … The TPU error looks like it's related to a TF version mismatch: …
Is there a magic combination of package versions we can use to get it to run on the TPU?
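Before hunting for a working combination, it helps to see exactly which versions the environment has. Here is a small helper (the helper name is invented; the package names are the usual PyPI distribution names, which may differ from your setup):

```python
# Report installed versions of the packages that commonly need to agree
# for TPU runs. Packages that aren't installed are reported as None
# instead of raising, so this runs in any environment.
from importlib.metadata import version, PackageNotFoundError

def installed_versions(names=("tensorflow", "tensorflow-text", "keras", "keras-hub")):
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found

print(installed_versions())
```

Comparing this output between a working GPU runtime and the failing TPU runtime is one way to spot the mismatch.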
Thanks, @Gopi-Uppari. Using …
Now I'll see what happens when I deploy the binary classifier to a TensorFlow Serving endpoint.
Hi @rlcauvin, if you run into any issues during deployment, just let us know. If the issue is resolved for you, please feel free to close it. Thank you.
@Gopi-Uppari Exporting the model for TensorFlow Serving fails.
Error message: …
I have updated the Colab to show the attempted export and the error.
Hi @rlcauvin, I reproduced the issue. The error suggests that TensorFlow is struggling to track a SentencePieceOp resource, likely coming from the GemmaEncoder layer. Since this layer uses a SentencePiece tokenizer, TensorFlow is unable to export it properly. There is also a warning about an input structure mismatch. To fix the issue, try the provided code, which correctly extracts and tracks the SentencePiece model while ensuring the input format aligns with the model's expectations. Please take a look at this gist file for further reference. Thank you.
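For readers following along, the general shape of the fix — defining an explicit serving signature and saving through `tf.saved_model.save` — can be sketched with a toy `tf.Module` standing in for the Gemma classifier (the `Scorer` class, its variable, and the tensor shapes are all invented for illustration; this is not the code from the gist):

```python
import tempfile
import tensorflow as tf

class Scorer(tf.Module):
    """Toy stand-in for the classifier; not the Gemma model from the issue."""
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(tf.zeros([4, 1]))

    # An explicit input_signature pins the input structure the exported
    # signature expects, avoiding structure-mismatch surprises at load time.
    @tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32, name="features")])
    def serve(self, features):
        return {"score": tf.sigmoid(tf.matmul(features, self.w))}

export_dir = tempfile.mkdtemp()
model = Scorer()
tf.saved_model.save(model, export_dir, signatures={"serving_default": model.serve})

# Restored signatures are called with keyword arguments named after the inputs.
reloaded = tf.saved_model.load(export_dir)
out = reloaded.signatures["serving_default"](features=tf.zeros([2, 4], tf.float32))
```

The real fix additionally has to make the SentencePiece asset trackable so it gets bundled into the SavedModel, which the toy module above does not need.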
@Gopi-Uppari Thanks! With your provided code for defining the function and signature, and using …, … In any case, I deployed it to TensorFlow Serving, and when I invoked the endpoint, the following error occurred:
Why would the variable …?
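For reference when debugging the endpoint call: the TF Serving REST predict API (`POST /v1/models/<model_name>:predict`) expects a JSON body with an `instances` list and an optional `signature_name`, and replies with a `predictions` list on success. The input key and feature values below are invented:

```python
import json

# Hypothetical request body for the TF Serving REST predict API.
# "signature_name" must match the signature the model was exported with;
# each entry in "instances" maps input names to values.
request_body = json.dumps({
    "signature_name": "serving_default",
    "instances": [{"features": [0.1, 0.2, 0.3, 0.4]}],
})

decoded = json.loads(request_body)
print(decoded["instances"][0]["features"])  # [0.1, 0.2, 0.3, 0.4]
```

A mismatch between the input names in the request and the names baked into the exported signature is a common source of serving-time errors.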
Describe the bug
Using the "gemma2_2b_en" Gemma pre-trained model in a neural network results in the following error during training:
ValueError: Cannot get result() since the metric has not yet been built.
To Reproduce
Stripped down example here: https://colab.research.google.com/drive/1r8XkaQBeUxP5fp9i1QLaikFIdbhcrKMw?usp=sharing
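The lifecycle behind that message can be illustrated schematically. This is not Keras's actual Metric implementation — `ToyMetric` and its methods are invented — but it mirrors the pattern: a metric's state is created lazily by a build step, and reading `result()` before anything has triggered that build fails:

```python
# Schematic of a build-before-result metric lifecycle (NOT Keras internals).
class ToyMetric:
    def __init__(self):
        self.built = False
        self.total = 0.0
        self.count = 0

    def build(self):
        self.built = True

    def update_state(self, value):
        if not self.built:
            self.build()  # building happens on first update
        self.total += value
        self.count += 1

    def result(self):
        if not self.built:
            raise ValueError(
                "Cannot get result() since the metric has not yet been built.")
        return self.total / self.count

m = ToyMetric()
m.update_state(1.0)
m.update_state(0.0)
print(m.result())  # 0.5
```

Under this reading, the reported error suggests the training loop asked the classifier's metrics for results before any update ever reached them — consistent with the suspicion that the Gemma encoding layer is breaking the flow upstream.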
Expected behavior
It should be possible to use a Gemma pre-trained model as a neural network layer in a binary classifier and successfully train the model.
Additional context
This use of Gemma to generate embeddings for binary classification is based on this starting point by @jeffcarp.
Would you like to help us fix it?
No