Conversation API #789

pblazej · 2025-09-18T12:32:15Z

Adds 3 basic building blocks for simple(r) agent experiences:

Conversation - connection, pre-connect, agent dispatch, agent filtering (e.g. by name), all agents, messages (broadcast and aggregated for now)
Agent - wrapper around Participant, knows its tracks and internal state
LocalMedia - (unrelated) helper to deal with local tracks in SwiftUI

Example: livekit-examples/agent-starter-swift#29

pblazej · 2025-09-18T12:43:55Z

Sources/LiveKit/Agent/Chat/Receive/TranscriptionStreamReceiver.swift

Finally add some unit tests to that.

pblazej · 2025-09-18T12:58:00Z

Sources/LiveKit/Agent/Conversation.swift

+        }
+    }
+
+    @Published public private(set) var agents: [Participant.Identity: Agent] = [:]


I think .ID is still a better key than name itself.

Yes, because name isn't unique anyway: you can have many agents with the same agent_name and unique participant identities.
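To make the keying argument concrete, here is a minimal sketch. `Identity` and `Agent` below are hypothetical stand-ins defined only for this example, not the SDK's `Participant.Identity` or the `Agent` type from this PR, and the identity strings are made up.

```swift
// Hypothetical stand-ins for illustration; not the SDK types.
struct Identity: Hashable { let stringValue: String }
struct Agent { let identity: Identity; let name: String }

// Two agents dispatched under the same agent_name but with unique identities.
let joined = [
    Agent(identity: Identity(stringValue: "agent-1"), name: "support"),
    Agent(identity: Identity(stringValue: "agent-2"), name: "support"),
]

// Keyed by identity: both agents are retained.
var agents: [Identity: Agent] = [:]
for agent in joined { agents[agent.identity] = agent }

// Keyed by name: the second agent overwrites the first.
var byName: [String: Agent] = [:]
for agent in joined { byName[agent.name] = agent }

// Filtering by name still works on top of identity keys.
let supportAgents = agents.values.filter { $0.name == "support" }
```

Keeping identity as the key and treating name as a filter preserves every participant while still allowing lookups by name.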

pblazej · 2025-09-18T12:59:56Z

Sources/LiveKit/Agent/Conversation.swift

+
+    // MARK: - Init
+
+    public init(credentials: CredentialsProvider, room: Room = .init(), agentName: String? = nil, senders: [any MessageSender]? = nil, receivers: [any MessageReceiver]? = nil) {


@1egoman @lukasIO I think that's the discussion about the logic:

agentName should take part in the direct dispatch

we'll introduce plural case later while keeping an internal array? it creates some confusion why the conversation cannot happen "with multiple agents"

wait for agents can only happen at the conversation level (as the Agent will be published when joining)

I believe we should check the names vs who actually joined

name shouldn't take part in the filtering? so that I'll keep an agent that I formally did not pass?

1egoman · 2025-09-18

> we'll introduce plural case later while keeping an internal array? it creates some confusion why the conversation cannot happen "with multiple agents"

This was generally what I had in mind, yes: start with a singular agentName-type parameter, then in the future add an internal array backed by a new agentNames parameter. (In Swift, I'd think it could probably be a new overload? On web, the equivalent would be for the parameter to accept either a string or Array<string>, and to be renamed, since parameter names aren't part of the external interface in JS.)

This link may be useful and represents the initial state of this on the web: https://github.com/livekit/components-js/pull/1207/files#diff-c2401cb9c778162d5def12d137a663f03477b02c804d9372846c318e903df77bR125-R146
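The singular-then-plural idea could look roughly like the sketch below in Swift. `ConversationSketch` is a hypothetical type written only for this example; it is not the `Conversation` class from this PR and omits credentials, room, senders, and receivers.

```swift
// Hypothetical sketch; not the Conversation type from this PR.
final class ConversationSketch {
    // Internal storage is plural from the start.
    private(set) var agentNames: [String]

    // Possible future overload accepting several names.
    init(agentNames: [String]) {
        self.agentNames = agentNames
    }

    // Singular form: nil means no specific agent was requested.
    convenience init(agentName: String? = nil) {
        self.init(agentNames: agentName.map { [$0] } ?? [])
    }
}

let single = ConversationSketch(agentName: "support")
let unnamed = ConversationSketch()
let several = ConversationSketch(agentNames: ["support", "billing"])
```

Because argument labels are part of a Swift initializer's name, adding the plural overload later would not break callers of the singular one.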

> wait for agents can only happen at the conversation level (as the Agent will be published when joining)

On the web right now, useConversation calls useAgent to get access to the current agent, and the agent exposes a method called waitForAvailable, which returns a promise that resolves once agent.isAvailable is true. conversation.start() then calls await agent.waitForAvailable().

Associated code here: https://github.com/livekit/components-js/pull/1207/files#diff-c2401cb9c778162d5def12d137a663f03477b02c804d9372846c318e903df77bR306-R346

Longer term, another use case I had in mind for conversation is maintaining a registry of all currently connected agents. I started building something like that in JavaScript with useAgentTimeoutStore here (an internal hook, not exposed). Right now it is largely responsible for agent timeout management, but I could see it growing, storing data across multiple agents, and moving underneath conversation in the future.

pblazej · 2025-09-19

@1egoman I think we're still just scratching the surface with "absence" 🥲

My high-level approach (not implemented yet) would be:

Conversation controls the (global) timeout (as it does the dispatch), agent does not know about its own timeout, etc. 🟢

Currently, passing one agentName is a little misleading for people having multiple agents anyway (without direct dispatch)...

When agentName is passed:

we do wait for this particular agent to join

should this wait be awaitable or happen in the background?

When multiple agentNames are passed:

do we wait for all?

When no agentName is passed:

we do wait for any agent 🟠

we do not wait at all?
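One way to keep the open options above explicit is to model them as a value instead of hard-coding one answer. The sketch below is purely illustrative: `AgentWaitPolicy` and `isSatisfied` are hypothetical names that do not exist in this PR, and the sketch only enumerates the choices without deciding between them.

```swift
// Hypothetical names; nothing here exists in the PR.
enum AgentWaitPolicy: Equatable {
    case dontWait              // do not wait at all
    case anyAgent              // the first agent to join satisfies the wait
    case all(Set<String>)      // every named agent must join (one name is the singular case)
}

// Returns true once the agents that have joined satisfy the policy.
func isSatisfied(_ policy: AgentWaitPolicy, joinedAgentNames: Set<String>) -> Bool {
    switch policy {
    case .dontWait:
        return true
    case .anyAgent:
        return !joinedAgentNames.isEmpty
    case .all(let names):
        return names.isSubset(of: joinedAgentNames)
    }
}
```

Whether the check is awaited or observed in the background is a separate decision; the predicate is the same in both cases.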

Agent is added to the registry when it joins the room (agent lifecycle == participant lifecycle), regardless of its (conversational) state 🟠. Theoretically we can add them with "listening" | "thinking" | "speaking", maybe including "idle", but that either limits the space of states (if we don't register them, with the guarantee that they won't disappear after turning idle) or is a little confusing (if we register them with another concept of "availability"). IMO, consumers should learn what AgentState means for their use cases; it's hard to tell what "available" means universally.

I'm not sure that await agent.waitUntilAvailable(signal) is a great idea: should it be awaitable if you still want to present some placeholder UI while the agent joins/dispatches? This await can be modeled by the agent's optionality (Agent?) or its tracks anyway 🟠

How to model iCanSpeak state - which is crucial for the UI:

During pre-connect: capturingAudio + .disconnected, .connecting, .reconnecting, .connected

After pre-connect: connectionState (room) is probably not enough, but how do we handle e.g. handoff? I'd rather interpret that as "I'm in the room and someone is listening" than "some agent X is listening", as there may be gaps, etc.
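One possible reading of the two cases above, as a sketch only. Every name here (`ConnectionState`, `SpeakInputs`, `canSpeak`) is hypothetical and defined just for the example. It assumes that pre-connect capture is enough to speak in any connection state, and that after pre-connect the user needs a connected room plus at least one agent listening, regardless of which one.

```swift
// Hypothetical model; not the SDK's connection state or any API from this PR.
enum ConnectionState { case disconnected, connecting, reconnecting, connected }

struct SpeakInputs {
    var connectionState: ConnectionState
    var isCapturingPreConnectAudio: Bool
    var listeningAgentCount: Int
}

func canSpeak(_ inputs: SpeakInputs) -> Bool {
    // Assumption: while pre-connect audio is being captured, any connection state is acceptable.
    if inputs.isCapturingPreConnectAudio { return true }
    // Afterwards: "I'm in the room and someone is listening", not a specific agent.
    return inputs.connectionState == .connected && inputs.listeningAgentCount > 0
}
```

Counting listeners rather than tracking one agent keeps the result stable across a handoff, although a gap with zero listeners would still flip it to false.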

The key difference is probably in how we think about "presence" - shall we make some arbitrary decisions here or no?

bcherry

this generally looks good - lmk when the API is considered final

bcherry · 2025-09-22T22:53:53Z

Sources/LiveKit/Agent/Conversation.swift

+        }
+    }
+
+    @Published public private(set) var agents: [Participant.Identity: Agent] = [:]


Yes, because name isn't unique anyway: you can have many agents with the same agent_name and unique participant identities.

pblazej force-pushed the blaze/agent-conversation branch from 6ea1621 to 2f9bbee Compare September 18, 2025 12:38

pblazej requested review from 1egoman, hiroshihorie, davidliu and lukasIO September 18, 2025 13:08

pblazej force-pushed the blaze/agent-conversation branch 2 times, most recently from 94ec7d0 to e5caee2 Compare September 18, 2025 13:34

bcherry reviewed Sep 22, 2025

pblazej added 7 commits September 23, 2025 14:01

Move basic Agent files (f342b17)
Fix inconsistencies (f1dc53c)
Media state from participant (cd27f1d)
Naming (100abcd)
Attributes gen (a4ab04c)
Transcription tests (1ea3f56)
Extract tests (212035c)

pblazej force-pushed the blaze/agent-conversation branch from aa93417 to 212035c Compare September 23, 2025 12:02

Renaming (c52f944)

pblazej marked this pull request as draft October 1, 2025 08:40