@Hormold Hormold commented Nov 26, 2025

Supervisor aka Observer aka Guardrails sub-agent

This PR introduces a background sub-agent that watches conversations and injects advice to the agent in real time.

Background: while working on outbound calling (especially cold calling), I hit a wall - fast TTFT is critical to sound human, so the primary agent must be lightweight (a compact prompt plus a blazing-fast, and therefore slightly dumb, LLM). But complex rules overload the prompt and slow it down. I discovered this pattern: offload monitoring to a separate thinking model running in parallel.

  • Background thinking observer watching every conversation
  • Catches what the realtime agent misses
  • Keeps agent prompt clean, monitoring logic separate and non-blocking

How it works:

  1. Listens to conversation_item_added events
  2. Every N agent responses, async evaluates transcript with observer LLM
  3. If issue detected, injects advice into agent's chat_ctx
  4. Agent sees it on next response

  • Async - evaluation takes 1-2s but runs in the background, so there is no latency impact
  • After agent response - ensures a complete exchange, avoids partial evaluation during barge-in
  • Injects as a system message - the agent sees it in context and uses it on the next turn
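The loop above can be sketched roughly like this. Names (`Guardrail`, `chat_ctx`) mirror the PR, but this is a standalone simulation of the pacing and injection logic, not the actual livekit-agents implementation; `_evaluate` stands in for the observer LLM call.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Guardrail:
    instructions: str
    eval_interval: int = 3          # check every N agent responses
    max_interventions: int = 5      # cap per session
    cooldown: float = 10.0          # min seconds between advice
    inject_prefix: str = "[GUARDRAIL ADVISOR]:"
    _responses: int = field(default=0, init=False)
    _interventions: int = field(default=0, init=False)
    _last_ts: float = field(default=float("-inf"), init=False)

    async def on_agent_response(self, transcript, chat_ctx, now):
        # Steps 1-2: count agent responses; only evaluate every Nth one.
        self._responses += 1
        if self._responses % self.eval_interval != 0:
            return
        # Pacing caps: max interventions per session, cooldown between them.
        if self._interventions >= self.max_interventions:
            return
        if now - self._last_ts < self.cooldown:
            return
        # Step 3: evaluate the transcript (1-2 s in reality, off the hot path).
        advice = await self._evaluate(transcript)
        if advice:
            # Step 4: inject advice; the agent sees it on its next response.
            self._interventions += 1
            self._last_ts = now
            chat_ctx.append(
                {"role": "system", "content": f"{self.inject_prefix} {advice}"}
            )

    async def _evaluate(self, transcript):
        # Stand-in for the observer LLM call, with one toy compliance rule.
        if any("pricing" in turn for turn in transcript):
            return 'Mention that "terms may vary" when discussing pricing.'
        return None
```

In the real integration, `on_agent_response` would be driven by `conversation_item_added` events and `chat_ctx` would be the agent's actual chat context.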

Example

# Outbound sales calls - agent pitches product, guardrail watches for compliance
session = AgentSession(
    llm="openai/gpt-4o-mini",  # fast for natural conversation
    guardrail=Guardrail(
        llm="deepseek-ai/deepseek-v3",  # reasons about context
        instructions="""
        Sales compliance monitor:
        - If discussing pricing, agent must mention "terms may vary"
        - If customer sounds hesitant, don't push - offer to send info instead
        - If customer mentions they're on fixed income, flag as vulnerable
        - Never let agent promise specific savings without disclaimer

        These rules are too nuanced for the sales script. Watch and intervene.
        """,
    ),
)
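For concreteness, the message the agent would see on its next turn might look like this. The wording is illustrative, assuming the default `inject_role="system"` and `inject_prefix`:

```python
# Illustrative only: the shape of an injected advice message, assuming the
# defaults inject_role="system" and inject_prefix="[GUARDRAIL ADVISOR]:".
injected = {
    "role": "system",
    "content": "[GUARDRAIL ADVISOR]: Customer sounds hesitant - "
               "offer to send information instead of pushing.",
}
```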

Configuration

  • instructions (required) - what to watch for, what advice to give
  • llm (required) - model for evaluation
  • eval_interval=3 - check every N agent responses (balance cost vs responsiveness)
  • max_interventions=5 - cap per session (prevent advice spam)
  • cooldown=10.0 - min seconds between advice (natural pacing)
  • inject_role="system" - role for injected message (see open question 1)
  • inject_prefix="[GUARDRAIL ADVISOR]:" - adds context to the advice, e.g. "A supervisor is watching this conversation and suggests: ...". Lets you control how the agent perceives the advice

Open questions

  • Do all LLMs support multiple system messages, or system messages in the middle of a conversation? The "developer" role was perfect for this, but it was deprecated by most LLMs some time ago.
  • Should the primary realtime agent be aware of observer injections?
  • How does it perform in realtime - has anyone tested this pattern at scale?
  • RAG integration: allow the user to hook contextual retrieval before evaluation? The observer could pull relevant docs/policies based on the conversation topic in the background

Synthetic Benchmarks Test Results (9 Scenarios, 4 Models):

Ran ~90 tests covering different LLMs and scenarios, focused on booking, support, and front-desk receptionist use cases, using 4 LLMs: customer, realtime agent, observer, and judge.

deepseek-v3: 86% precision, 100% recall, 86% follow rate, ~$0.06/100 evals
gpt-4o-mini: 86% precision, 100% recall, 86% follow rate, ~$0.03/100 evals
gpt-4o: 86% precision, 100% recall, 86% follow rate, ~$0.55/100 evals
gpt-5: 100% precision, 67% recall, 50% follow rate, ~$2/100 evals

Findings: cheap models perform just as well as expensive ones; gpt-5 was actually the worst (it overthinks).
Best result: deepseek-v3 or gpt-4o-mini for the observer - and without thinking enabled :-)
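The PR reports precision, recall, and follow rate without spelling out the formulas; under the standard reading (precision = justified interventions over all interventions, recall = issues caught over issues present, follow rate = injected advice the agent acted on), the numbers would be computed as below. The counts in the usage note are illustrative, not the actual benchmark tallies.

```python
# Hedged reconstruction of the reported metrics; the PR does not
# define the formulas, so these are the standard definitions.
def observer_metrics(tp: int, fp: int, fn: int, followed: int, injected: int):
    """Precision/recall over interventions, plus how often advice was followed."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    follow_rate = followed / injected if injected else 0.0
    return precision, recall, follow_rate
```

For example, 6 justified interventions out of 7, no missed issues, and 6 of 7 injections followed reproduces the 86% / 100% / 86% pattern above.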
