Live - "client-content" without end of turn blocks voice response. #682

Open
tiagoefreitas opened this issue Feb 16, 2025 · 4 comments
Labels
component:api Issues related to the API, not the SDK. status:triaged Issue/PR triaged to the corresponding sub-team type:feature request New feature request/enhancement

@tiagoefreitas
tiagoefreitas commented Feb 16, 2025

Description of the bug:

I was trying to add hidden context to a Gemini Live audio session, to give the model instructions during an audio conversation that are hidden from the user (not only at the beginning). The Gemini docs say previous context can be added with clientContent, but the model always responds, even if I add the model response as a turn with turnComplete=true, like this:

{
  "clientContent": {
    "turns": [
      {
        "role": "user",
        "parts": [
          {
            "text": "Context xxx"
          }
        ]
      },
      {
        "role": "model",
        "parts": [
          {
            "text": "ok."
          }
        ]
      }
    ],
    "turnComplete": true
  }
}

The model will still reply in audio with "ok" again.

And if turnComplete is false, the model will not reply to audio content until I send a text message to end the turn.
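
For reference, the two variants described above can be written out as plain JSON payloads. This is only a minimal sketch using Python's standard json module: the helper name build_client_content is hypothetical, the field names are copied from the message shown earlier in this issue, and the text of the follow-up message is a placeholder. Nothing here opens a connection or sends anything to the API.

```python
import json


def build_client_content(turns, turn_complete):
    """Build a clientContent message as a JSON string.

    Hypothetical helper: the field names mirror the payload in this issue.
    `turns` is a list of (role, text) pairs.
    """
    return json.dumps(
        {
            "clientContent": {
                "turns": [
                    {"role": role, "parts": [{"text": text}]}
                    for role, text in turns
                ],
                "turnComplete": turn_complete,
            }
        }
    )


hidden_context = [("user", "Context xxx"), ("model", "ok.")]

# Variant 1: turn marked complete.
# Reported behavior: the model still replies in audio with "ok".
complete_msg = build_client_content(hidden_context, turn_complete=True)

# Variant 2: turn left open.
# Reported behavior: no reply to audio until the turn is ended.
open_msg = build_client_content(hidden_context, turn_complete=False)

# Follow-up text message that ends the open turn (placeholder text).
end_turn_msg = build_client_content([("user", "placeholder")], turn_complete=True)
```

Printing complete_msg and open_msg side by side makes it easy to confirm that the only difference between the two cases is the turnComplete flag.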

If I instruct the model in the prompt not to reply, it still says “ok” or something similar 90% of the time; it seems the model was trained to always reply.

Actual vs expected behavior:

No response

Any other information you'd like to share?

No response

@gmKeshari gmKeshari added type:bug Something isn't working status:triaged Issue/PR triaged to the corresponding sub-team component:python sdk Issue/PR related to Python SDK labels Feb 17, 2025
@MarkDaoust MarkDaoust added component:api Issues related to the API, not the SDK. and removed component:python sdk Issue/PR related to Python SDK labels Feb 18, 2025
@MarkDaoust
Collaborator

Is this something you can't do with the system instructions at the start of the conversation?

> And if turnComplete is false, the model will not reply to audio content until I send a text message to end the turn.

I haven't tried mixing in client content without an end of turn. But this is not the behavior I would expect.

@tiagoefreitas
Author

@MarkDaoust It doesn't follow the instructions 90% of the time; it still replies.
This is kind of expected, as the model was likely trained on request/reply pairs.

But with turnComplete=true and a model-role turn, it should not reply again, and yet it does.

@MarkDaoust
Collaborator

Thanks for the feedback, I've raised this with the internal API team.

@MarkDaoust MarkDaoust changed the title from "Live context" to 'Live - "client-content" without end of turn blocks voice response.' Feb 20, 2025
@MarkDaoust MarkDaoust assigned MarkDaoust and unassigned pamorgan Mar 19, 2025
@MarkDaoust
Collaborator

We're going to make turn_complete the default so this doesn't happen by accident.
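
As an illustration of what such a default could look like from the caller's side, here is a small sketch. The function client_content_payload is hypothetical and is not the SDK's API; it only shows a turn_complete keyword argument that defaults to True, so that leaving a turn open has to be requested explicitly.

```python
def client_content_payload(turns, turn_complete=True):
    """Wrap already-formed turns in a clientContent message (hypothetical helper).

    turn_complete defaults to True, so a caller who forgets the flag
    does not leave the turn open by accident.
    """
    return {"clientContent": {"turns": turns, "turnComplete": turn_complete}}


turns = [{"role": "user", "parts": [{"text": "Context xxx"}]}]

default_payload = client_content_payload(turns)                      # turn ends
open_payload = client_content_payload(turns, turn_complete=False)    # explicit opt-out
```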

@MarkDaoust MarkDaoust added type:feature request New feature request/enhancement and removed type:bug Something isn't working labels Apr 1, 2025