Conversation API #789
base: blaze/connection-provider
6ea1621
to
2f9bbee
Compare
Finally add some unit tests to that.
    }
}

@Published public private(set) var agents: [Participant.Identity: Agent] = [:]
I think `.ID` is still a better key than name itself.
yes because name isn't unique anyways - you can have many agents with the same agent_name and unique participant identities
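To make the uniqueness point concrete, here is a minimal sketch. `Identity` and `Agent` below are hypothetical stand-ins, not the SDK's `Participant.Identity` or the `Agent` type from this PR; the only assumption carried over from the thread is that two agents may share a name while having distinct identities.

```swift
// Hypothetical stand-ins for the SDK types; not the real LiveKit definitions.
struct Identity: Hashable {
    let rawValue: String
}

struct Agent {
    let identity: Identity
    let name: String // agent_name, may repeat across participants
}

let joined = [
    Agent(identity: Identity(rawValue: "agent-1"), name: "support"),
    Agent(identity: Identity(rawValue: "agent-2"), name: "support"),
]

// Keyed by name: the second "support" agent overwrites the first.
var byName: [String: Agent] = [:]
for agent in joined { byName[agent.name] = agent }

// Keyed by identity: both agents are retained.
var byIdentity: [Identity: Agent] = [:]
for agent in joined { byIdentity[agent.identity] = agent }

print(byName.count, byIdentity.count) // prints "1 2"
```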

// MARK: - Init

public init(credentials: CredentialsProvider, room: Room = .init(), agentName: String? = nil, senders: [any MessageSender]? = nil, receivers: [any MessageReceiver]? = nil) {
@1egoman @lukasIO I think that's the discussion about the logic:
- `agentName` should take part in the direct dispatch
  - we'll introduce plural case later while keeping an internal array? it creates some confusion why the conversation cannot happen "with multiple agents"
- wait for agents can only happen at the conversation level (as the `Agent` will be published when joining)
  - I believe we should check the names vs who actually joined
- name shouldn't take part in the filtering? so that I'll keep an agent that I formally did not pass?
> we'll introduce plural case later while keeping an internal array? it creates some confusion why the conversation cannot happen "with multiple agents"

This was generally what I had in mind, yes: start with a singular `agentName`-type parameter, and then in the future add an internal array which can be backed by a new `agentNames` parameter. (In Swift I'd think it could probably be a new overload? On web, the way to accomplish the same thing would be that the parameter would now accept either a `string` or `Array<string>`, and be renamed, since parameter names aren't part of the external interface in JS.)
This link may be useful and represents the initial state of this on the web: https://github.com/livekit/components-js/pull/1207/files#diff-c2401cb9c778162d5def12d137a663f03477b02c804d9372846c318e903df77bR125-R146
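For the Swift side, the overload idea could look roughly like the sketch below. `ConversationSketch` is a hypothetical, stripped-down type (the real initializer also takes credentials, a room, senders and receivers), and `agentNames` is only the possible future parameter discussed above, not an existing API.

```swift
// Hypothetical, simplified model of the initializer shape only.
final class ConversationSketch {
    // Internal storage is already plural, even though the first public API is singular.
    let agentNames: [String]

    // Current shape: a single optional agent name.
    convenience init(agentName: String? = nil) {
        self.init(agentNames: agentName.map { [$0] } ?? [])
    }

    // Possible future overload: several agent names.
    init(agentNames: [String]) {
        self.agentNames = agentNames
    }
}

let single = ConversationSketch(agentName: "support")
let unnamed = ConversationSketch()
let many = ConversationSketch(agentNames: ["support", "billing"])
```

Existing call sites that pass `agentName:` (or nothing) keep compiling once the plural overload is added, which is the main appeal of this shape.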
> wait for agents can only happen at the conversation level (as the Agent will be published when joining)

On the web right now, what I'm doing is: in `useConversation` I'm calling `useAgent` to get access to the current agent, and on the agent I exposed a method called `waitForAvailable`, which returns a promise that resolves once `agent.isAvailable` is true. `await agent.waitForAvailable()` is then called in `conversation.start()`.
Associated code here: https://github.com/livekit/components-js/pull/1207/files#diff-c2401cb9c778162d5def12d137a663f03477b02c804d9372846c318e903df77bR306-R346
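A rough Swift analogue of that promise-based wait, purely as a sketch: `AgentAvailability` is a hypothetical helper (it does not exist in the SDK or in this PR), and it deliberately ignores timeouts and cancellation, which the web version handles separately.

```swift
// Hypothetical helper; not part of the SDK. No timeout or cancellation handling.
actor AgentAvailability {
    private(set) var isAvailable = false
    private var waiters: [CheckedContinuation<Void, Never>] = []

    // Called once the agent is considered available; wakes every pending waiter.
    func markAvailable() {
        isAvailable = true
        waiters.forEach { $0.resume() }
        waiters.removeAll()
    }

    // Returns immediately if already available, otherwise suspends until markAvailable().
    func waitUntilAvailable() async {
        if isAvailable { return }
        await withCheckedContinuation { continuation in
            waiters.append(continuation)
        }
    }
}

let availability = AgentAvailability()
Task { await availability.markAvailable() } // e.g. triggered by a participant-joined event
await availability.waitUntilAvailable()
print("agent is available")
```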
In the future, what I had been thinking (and another use case for conversation) is that it could maintain a registry of all currently connected agents. I started going down the road of building that in JavaScript, sort of, with `useAgentTimeoutStore` here (internal hook / not exposed). Right now it is largely responsible for agent timeout management, but I could see that growing, storing data across multiple agents, and moving underneath conversation in the future.
@1egoman I think we're still scratching the surface with "absence" 🥲
My high-level approach (not implemented yet) would be:
- Conversation controls the (global) timeout (as it does the dispatch), agent does not know about its own timeout, etc. 🟢
- Currently, passing one `agentName` is a little misleading for people having multiple agents anyway (without direct dispatch)...
- When `agentName` is passed:
  - we do wait for this particular agent to join
    - should this wait be awaitable or background
- When multiple `agentNames` are passed:
  - do we wait for all?
- When no `agentName` is passed:
  - we do wait for any agent 🟠
  - we do not wait at all?
- `Agent` is added to the registry when it joins the room (agent lifecycle == participant lifecycle), regardless of its (conversational) state 🟠 theoretically we can add them with `"listening" | "thinking" | "speaking";`, maybe including `"idle"`, but it also limits the space of states (if we don't register them - with the guarantee that they won't disappear after turning `idle`) or is a little confusing (if we register them with another concept of "availability"). IMO, consumers should learn what `AgentState` means for their use cases, it's hard to tell what "available" means universally.
  - I'm not sure if doing that `await agent.waitUntilAvailable(signal);` is a great idea - should it be awaitable if you still wanna present some placeholder UI while agent joins/dispatches; this `await` can be modeled by agent's optionality `Agent?` / its tracks anyway 🟠
- How to model `iCanSpeak` state - which is crucial for the UI:
  - During pre-connect: `capturingAudio + .disconnected, .connecting, .reconnecting, .connected`
  - After pre-connect: probably `connectionState` (room) is not enough, but how to handle e.g. handoff - I'd rather interpret that as "I'm in the room and someone is listening" rather than "some agent X is listening" as there may be gaps, etc.

The key difference is probably in how we think about "presence" - shall we make some arbitrary decisions here or no?
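The "agent lifecycle == participant lifecycle" option from the list above could be sketched like this. Everything here is hypothetical (`AgentEntry`, `AgentRegistry`, the string identities, and the `AgentState` cases are illustrative stand-ins, not the types in this PR); the point is only that state changes never remove an entry and that absence is expressed through an optional rather than an `await`.

```swift
// Hypothetical types for illustration only.
enum AgentState { case idle, listening, thinking, speaking }

struct AgentEntry {
    let identity: String
    let name: String
    var state: AgentState
}

struct AgentRegistry {
    private(set) var agents: [String: AgentEntry] = [:]

    init() {}

    // Registered as soon as the participant joins, whatever its state.
    mutating func participantJoined(identity: String, name: String) {
        agents[identity] = AgentEntry(identity: identity, name: name, state: .idle)
    }

    // State changes update the entry but never remove it.
    mutating func stateChanged(identity: String, to state: AgentState) {
        agents[identity]?.state = state
    }

    // Removed only when the participant leaves.
    mutating func participantLeft(identity: String) {
        agents[identity] = nil
    }

    // Absence is modeled by optionality rather than by awaiting.
    func firstAgent(named name: String) -> AgentEntry? {
        agents.values.first { $0.name == name }
    }
}

var registry = AgentRegistry()
registry.participantJoined(identity: "agent-1", name: "support")
registry.stateChanged(identity: "agent-1", to: .listening)
registry.stateChanged(identity: "agent-1", to: .idle)
let presentWhileIdle = registry.firstAgent(named: "support") != nil
registry.participantLeft(identity: "agent-1")
let absentAfterLeave = registry.firstAgent(named: "support") == nil
```

A UI could render a placeholder while `firstAgent(named:)` returns `nil` and switch to the agent view once it becomes non-nil, which is the optionality-based alternative to awaiting availability.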
94ec7d0
to
e5caee2
Compare
this generally looks good - lmk when the API is considered final
aa93417
to
212035c
Compare
Adds 3 basic building blocks for simple(r) agent experiences:
- `Conversation` - connection, pre-connect, agent dispatch, agent filtering (e.g. by name), all agents, messages (broadcasted and aggregated for now)
- `Agent` - wrapper around `Participant`, knows its tracks and internal state
- `LocalMedia` - (unrelated) helper to deal with local tracks in SwiftUI

Example: livekit-examples/agent-starter-swift#29