The FunctionRegistry semantic router is intended to match incoming tasks against previously cached and verified functions to avoid redundant code generation. In practice, when tasks are submitted through a conversational interface where message history grows over time, the embedding of the task drifts with each call — even for semantically identical requests — causing the router to miss cache hits and generate new code every time.
Steps to Reproduce
# Submit the same task three times in a chat session with growing history
tasks = [
    "fetch profile for user 1",
    "fetch profile for user 1",  # identical
    "fetch profile for user 1",  # identical
]
for task in tasks:
    result = await agent.execute_task({"task": task, "history": growing_chat_history})
# Observed: all three trigger a full CoderSubAgent code generation cycle
# Expected: calls 2 and 3 hit the registry
Root Cause
The task embedding is generated from the full task dict including conversation history. As history grows, the embedding changes even when the core task intent is identical. The registry has no intent normalization layer — it stores functions keyed by raw task description embeddings, not canonical intent embeddings.
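The drift described above can be illustrated with a toy model. This is a minimal sketch, not the embedding model AutoAgent actually uses: `toy_embed` is a hypothetical bag-of-words stand-in, and the chat turns are made up. It only shows the general effect: when the history is serialized into the text being embedded, similarity to the originally cached task falls as the history grows, even though the task string never changes.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: bag-of-words token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


task = "fetch profile for user 1"
cached = toy_embed(task)  # what a registry would store if keyed on the task alone

history = []
similarities = []
for turn in ["hi there", "what can you do", "thanks, that helps a lot"]:
    history.append(turn)
    # Assumption: the task is embedded together with the whole chat history.
    payload = task + " " + " ".join(history)
    similarities.append(cosine(cached, toy_embed(payload)))

# Similarity to the cached entry shrinks on every call.
print([round(s, 3) for s in similarities])
```

With any fixed similarity threshold, a long enough history pushes an identical request below the cutoff, which matches the repeated cache misses reported here.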
Proposed Solution
Separate task embedding into two stages:
- Intent extraction: A lightweight pass that strips conversational context and extracts the canonical intent (e.g. "fetch_user_profile", "update_schedule").
- Registry lookup: Match on the canonical intent embedding, not the full task string embedding.
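The two stages could be sketched roughly as follows. Everything here is hypothetical and for illustration only: `INTENT_PATTERNS`, `extract_intent`, and `IntentKeyedRegistry` are not part of the existing FunctionRegistry API, and the regex rules stand in for whatever lightweight extractor is actually chosen. To keep the sketch self-contained, the lookup uses an exact match on the intent label rather than an intent embedding. The property being shown is the same: the lookup key no longer depends on the history.

```python
import re
from typing import Callable, Dict, Optional

# Hypothetical rule table mapping surface phrasing to canonical intents.
INTENT_PATTERNS = [
    (re.compile(r"\b(fetch|get|show)\b.*\bprofile\b"), "fetch_user_profile"),
    (re.compile(r"\b(update|change)\b.*\bschedule\b"), "update_schedule"),
]


def extract_intent(task: dict) -> Optional[str]:
    """Stage 1: ignore task['history'] and reduce the task text to a canonical intent."""
    text = task["task"].lower()
    for pattern, intent in INTENT_PATTERNS:
        if pattern.search(text):
            return intent
    return None


class IntentKeyedRegistry:
    """Stage 2: store and look up verified functions by canonical intent."""

    def __init__(self) -> None:
        self._functions: Dict[str, Callable] = {}

    def register(self, intent: str, fn: Callable) -> None:
        self._functions[intent] = fn

    def lookup(self, task: dict) -> Optional[Callable]:
        intent = extract_intent(task)
        return self._functions.get(intent) if intent is not None else None


registry = IntentKeyedRegistry()
registry.register("fetch_user_profile", lambda user_id: {"id": user_id})

history = []
hits = []
for turn in ["hi there", "what can you do", "thanks"]:
    history.append(turn)
    task = {"task": "fetch profile for user 1", "history": list(history)}
    hits.append(registry.lookup(task) is not None)

print(hits)  # every call resolves to the cached function
```

A real implementation would still need to carry task arguments (such as the user ID) separately from the intent, since the canonical intent deliberately drops them.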
Impact
Without reliable registry hits, every request triggers a full code generation cycle. The primary cost and latency savings of AutoAgent — verified function reuse — do not materialize in real-world conversational deployments, making the FunctionRegistry effectively inert.