feat(guardrail): add conversation monitoring for AgentSession #4105
+477
−0
Supervisor aka Observer aka Guardrails sub-agent
This PR introduces a background sub-agent that watches conversations and injects advice to the agent in real time.
Background: While working on outbound calling (especially cold calling), I hit a wall: fast TTFT is critical to sounding human, so the primary agent must be lightweight (compact prompt + blazing-fast LLM = slightly dumb). But complex rules overload the prompt and slow it down. The pattern I landed on: offload monitoring to a separate thinking model that runs in parallel.
How it works:
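Async - evaluation takes 1-2 s but runs in the background, so it adds no latency to the agent's response
After agent response - evaluation starts once the agent has responded, which ensures a complete exchange and avoids partial evaluation during barge-in
Injects as system message - the agent sees the advice in its context and uses it on the next turn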
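The three points above can be sketched with plain `asyncio`. This is a minimal illustration under stated assumptions, not the code in this PR and not the framework's API: `Message`, `Conversation`, `Observer`, `evaluate`, and `on_agent_response_done` are all hypothetical names, and `evaluate` is a keyword-matching stand-in for the observer LLM call.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Message:
    role: str  # "user", "assistant", or "system"
    content: str


@dataclass
class Conversation:
    messages: List[Message] = field(default_factory=list)


async def evaluate(history: List[Message]) -> Optional[str]:
    """Hypothetical stand-in for the observer LLM call."""
    await asyncio.sleep(0.01)  # simulates the slow evaluation
    last_user = next((m for m in reversed(history) if m.role == "user"), None)
    if last_user and "refund" in last_user.content.lower():
        return "Refunds need manager approval; do not promise one."
    return None


class Observer:
    """Hypothetical observer: evaluates in the background, injects advice."""

    def __init__(self, conversation: Conversation) -> None:
        self.conversation = conversation
        self._tasks: Set[asyncio.Task] = set()

    def on_agent_response_done(self) -> None:
        # Called only after the agent finished its reply, so the
        # snapshot always contains a complete user/agent exchange.
        snapshot = list(self.conversation.messages)
        task = asyncio.create_task(self._run(snapshot))
        self._tasks.add(task)  # keep a reference so the task is not dropped
        task.add_done_callback(self._tasks.discard)

    async def _run(self, snapshot: List[Message]) -> None:
        advice = await evaluate(snapshot)
        if advice:
            # Injected as a system message; the agent reads it next turn.
            self.conversation.messages.append(Message("system", advice))

    async def drain(self) -> None:
        """Wait for pending evaluations (useful in tests)."""
        await asyncio.gather(*self._tasks)


async def demo() -> Conversation:
    convo = Conversation()
    observer = Observer(convo)
    convo.messages.append(Message("user", "I want a refund."))
    convo.messages.append(Message("assistant", "Let me look into that."))
    observer.on_agent_response_done()  # returns immediately, no added latency
    await observer.drain()
    return convo
```

The hook only schedules a task and returns, so the agent's turn is never blocked. The advice lands in the message list whenever the evaluation finishes.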
Example
Configuration
Open questions
Synthetic Benchmarks Test Results (9 Scenarios, 4 Models):
Ran ~90 tests covering different LLMs and scenarios, focused on booking, support, and front-desk receptionist use cases. Each test used 4 LLM roles: customer, realtime agent, observer, and judge.
deepseek-v3: 86% precision, 100% recall, 86% follow rate, ~$0.06/100 evals
gpt-4o-mini: 86% precision, 100% recall, 86% follow rate, ~$0.03/100 evals
gpt-4o: 86% precision, 100% recall, 86% follow rate, ~$0.55/100 evals
gpt-5: 100% precision, 67% recall, 50% follow rate, ~$2/100 evals
Findings: cheap models perform the same as expensive ones. gpt-5 has perfect precision but the worst recall and follow rate (it overthinks).
Best result: deepseek-v3 or gpt-4o-mini for the observer. And they are not even thinking models :-)
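For readers comparing the numbers above, here is a rough sketch of how the three metrics could be computed from judge-labelled outcomes. Precision and recall use their standard definitions. The follow-rate definition (hints the agent acted on divided by hints injected) is an assumption, since the PR does not spell it out. `ObserverCounts` is a hypothetical name and the counts in the usage example are illustrative, not the PR's raw data.

```python
from dataclasses import dataclass


@dataclass
class ObserverCounts:
    true_positives: int   # hint injected and the judge agreed it was needed
    false_positives: int  # hint injected although none was needed
    false_negatives: int  # hint was needed but the observer stayed silent
    hints_followed: int   # injected hints the agent acted on (assumed meaning)


def precision(c: ObserverCounts) -> float:
    """Share of injected hints that were actually warranted."""
    injected = c.true_positives + c.false_positives
    return c.true_positives / injected if injected else 0.0


def recall(c: ObserverCounts) -> float:
    """Share of real problems for which the observer injected a hint."""
    needed = c.true_positives + c.false_negatives
    return c.true_positives / needed if needed else 0.0


def follow_rate(c: ObserverCounts) -> float:
    """Share of injected hints the agent followed (assumed definition)."""
    injected = c.true_positives + c.false_positives
    return c.hints_followed / injected if injected else 0.0


# Illustrative counts only, not taken from the benchmark runs.
counts = ObserverCounts(
    true_positives=8, false_positives=2, false_negatives=1, hints_followed=7
)
```

With these counts, `precision(counts)` is 8/10 and `recall(counts)` is 8/9. An observer that speaks up rarely can score high on precision while missing real problems, which is the trade-off visible in the gpt-5 row.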