feat(agent): reduce orchestrator prompt bloat #3367
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| id = "desktop_control_agent" | ||
| display_name = "Desktop Control Agent" | ||
| delegate_name = "delegate_desktop_control" | ||
| when_to_use = "Desktop control specialist — launches desktop apps and operates native UI through accessibility, automation, screenshot, mouse, and keyboard tools. Owns list-before-press behavior, foreground-first input, fallback from AX to keyboard/mouse, and sensitive-app constraints." | ||
| temperature = 0.2 | ||
| max_iterations = 8 | ||
| agent_tier = "worker" | ||
| omit_identity = true | ||
| omit_memory_context = true | ||
| omit_safety_preamble = false | ||
| omit_skills_catalog = true | ||
| omit_profile = true | ||
| omit_memory_md = true | ||
|
|
||
| [model] | ||
| hint = "agentic" | ||
|
|
||
| [tools] | ||
| named = [ | ||
| "launch_app", | ||
| "ax_interact", | ||
| "automate", | ||
| "screenshot", | ||
| "mouse", | ||
| "keyboard", | ||
| "ask_user_clarification", | ||
| ] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| pub mod prompt; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # Desktop Control Agent | ||
|
|
||
| You are the desktop-control specialist. Launch apps and operate native desktop UI through accessibility, automation, screenshot, mouse, and keyboard tools. | ||
|
|
||
| ## Rules | ||
|
|
||
| - Use `launch_app` for explicit app-launch requests. | ||
| - Use `ax_interact` for semantic accessibility interactions. | ||
| - Always call `ax_interact` with `action:"list"` before `press` or `set_value`. | ||
| - Use `automate` for multi-step app workflows, such as playing a song in Music or sending a message in Slack. | ||
| - Before any keyboard or mouse action, foreground the target app with `launch_app`. | ||
| - Prefer `automate` or `ax_interact` first. If the accessibility tree is empty, stuck, or only shows menu-bar items, fall back to keyboard-driven control for Electron/Chromium apps. | ||
| - Use `screenshot` plus `mouse` only when semantic or keyboard control cannot target the needed element. | ||
| - Never invent element labels. Act only on elements returned by `list` or clearly named by the user. | ||
| - Respect sensitive-app constraints and tool denials. Do not work around password managers, Keychain, System Settings, terminals, or other denied surfaces. | ||
| - If the target app or UI element is unclear, call `ask_user_clarification`. | ||
| - Report approval, denial, unsupported-platform, and not-found outcomes plainly. | ||
|
|
||
| ## Output | ||
|
|
||
| Return a compact result for the parent: | ||
|
|
||
| - Answer | ||
| - Evidence used | ||
| - Actions taken | ||
| - Open uncertainties | ||
| - Failed tool calls | ||
| - Recommended next step |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| //! System prompt builder for the `desktop_control_agent` built-in agent. | ||
|
|
||
| use crate::openhuman::context::prompt::{ | ||
| render_tools, render_user_files, render_workspace, PromptContext, | ||
| }; | ||
| use anyhow::Result; | ||
|
|
||
| const ARCHETYPE: &str = include_str!("prompt.md"); | ||
|
|
||
| pub fn build(ctx: &PromptContext<'_>) -> Result<String> { | ||
| let mut out = String::with_capacity(4096); | ||
| out.push_str(ARCHETYPE.trim_end()); | ||
| out.push_str("\n\n"); | ||
|
|
||
| let user_files = render_user_files(ctx)?; | ||
| if !user_files.trim().is_empty() { | ||
| out.push_str(user_files.trim_end()); | ||
| out.push_str("\n\n"); | ||
| } | ||
|
|
||
| let tools = render_tools(ctx)?; | ||
| if !tools.trim().is_empty() { | ||
| out.push_str(tools.trim_end()); | ||
| out.push_str("\n\n"); | ||
| } | ||
|
|
||
| let workspace = render_workspace(ctx)?; | ||
| if !workspace.trim().is_empty() { | ||
| out.push_str(workspace.trim_end()); | ||
| out.push('\n'); | ||
| } | ||
|
|
||
| Ok(out) | ||
| } | ||
|
Comment on lines +10 to +34 (Contributor)
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Add required debug/trace instrumentation in prompt assembly flow.
Proposed patch
use anyhow::Result;
+use tracing::{debug, trace};
const ARCHETYPE: &str = include_str!("prompt.md");
pub fn build(ctx: &PromptContext<'_>) -> Result<String> {
+ debug!(
+ agent_id = %ctx.agent_id,
+ "[desktop_control_agent::prompt] build start"
+ );
let mut out = String::with_capacity(4096);
out.push_str(ARCHETYPE.trim_end());
out.push_str("\n\n");
+ trace!(agent_id = %ctx.agent_id, section = "user_files", "[desktop_control_agent::prompt] rendering section");
let user_files = render_user_files(ctx)?;
if !user_files.trim().is_empty() {
+ debug!(agent_id = %ctx.agent_id, section = "user_files", "[desktop_control_agent::prompt] section included");
out.push_str(user_files.trim_end());
out.push_str("\n\n");
+ } else {
+ trace!(agent_id = %ctx.agent_id, section = "user_files", "[desktop_control_agent::prompt] section skipped_empty");
}
+ trace!(agent_id = %ctx.agent_id, section = "tools", "[desktop_control_agent::prompt] rendering section");
let tools = render_tools(ctx)?;
if !tools.trim().is_empty() {
+ debug!(agent_id = %ctx.agent_id, section = "tools", "[desktop_control_agent::prompt] section included");
out.push_str(tools.trim_end());
out.push_str("\n\n");
+ } else {
+ trace!(agent_id = %ctx.agent_id, section = "tools", "[desktop_control_agent::prompt] section skipped_empty");
}
+ trace!(agent_id = %ctx.agent_id, section = "workspace", "[desktop_control_agent::prompt] rendering section");
let workspace = render_workspace(ctx)?;
if !workspace.trim().is_empty() {
+ debug!(agent_id = %ctx.agent_id, section = "workspace", "[desktop_control_agent::prompt] section included");
out.push_str(workspace.trim_end());
out.push('\n');
+ } else {
+ trace!(agent_id = %ctx.agent_id, section = "workspace", "[desktop_control_agent::prompt] section skipped_empty");
}
+ debug!(
+ agent_id = %ctx.agent_id,
+ final_chars = out.chars().count(),
+ "[desktop_control_agent::prompt] build done"
+ );
Ok(out)
}
As per coding guidelines: “Log entry/exit, branches, external calls, retries/timeouts, state transitions, and errors with stable grep-friendly prefixes” and “Add substantial debug-level logs while implementing features or fixes in Rust using …” |
||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
| use crate::openhuman::context::prompt::{LearnedContextData, ToolCallFormat}; | ||
| use std::collections::HashSet; | ||
|
|
||
| #[test] | ||
| fn build_returns_desktop_control_contract() { | ||
| let visible = HashSet::new(); | ||
| let ctx = PromptContext { | ||
| workspace_dir: std::path::Path::new("."), | ||
| model_name: "test", | ||
| agent_id: "desktop_control_agent", | ||
| tools: &[], | ||
| skills: &[], | ||
| dispatcher_instructions: "", | ||
| learned: LearnedContextData::default(), | ||
| visible_tool_names: &visible, | ||
| tool_call_format: ToolCallFormat::PFormat, | ||
| connected_integrations: &[], | ||
| connected_identities_md: String::new(), | ||
| include_profile: false, | ||
| include_memory_md: false, | ||
| curated_snapshot: None, | ||
| user_identity: None, | ||
| personality_soul_md: None, | ||
| personality_memory_md: None, | ||
| personality_roster: vec![], | ||
| workflows: &[], | ||
| }; | ||
| let body = build(&ctx).unwrap(); | ||
| assert!(body.contains("Desktop Control Agent")); | ||
| assert!(body.contains("action:\"list\"")); | ||
| assert!(body.contains("sensitive-app")); | ||
| } | ||
| } | ||
Minor / parity note:
`render_structured_handoff` reshapes every archetype delegation, not just the new specialists. `ArchetypeDelegationTool` backs all delegates (crypto, integrations, etc.), so every child prompt is now wrapped in `Task:\n…` even when no structured fields are passed. It's additive and well-tested, but the "existing `prompt` delegation remains compatible" parity claim is a bit stronger than reality: the literal child-prompt string changes for all delegates. Worth a one-line note in the parity section, and consider skipping the `Task:\n` wrapper when no structured fields are present to keep the legacy path byte-identical.
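
A minimal sketch of the suggested short-circuit, for illustration only. The `HandoffFields` struct, its `context` and `constraints` fields, the `Context:` and `Constraints:` headings, and the function name `render_handoff_sketch` are all hypothetical; the real `render_structured_handoff` signature and field set in this PR may differ. The only behavior taken from the comment above is that the prompt is wrapped in `Task:\n` and that the legacy path should stay byte-identical when no structured fields are passed.

```rust
/// Hypothetical structured fields attached to a delegation.
/// The real type in the PR may carry different or additional fields.
#[derive(Default)]
pub struct HandoffFields {
    pub context: Option<String>,
    pub constraints: Vec<String>,
}

impl HandoffFields {
    /// True when the caller supplied nothing beyond the plain prompt.
    fn is_empty(&self) -> bool {
        self.context.as_deref().map_or(true, |c| c.trim().is_empty())
            && self.constraints.is_empty()
    }
}

/// Sketch: return the prompt untouched on the legacy path, and only add
/// the `Task:` wrapper when at least one structured field is present.
pub fn render_handoff_sketch(prompt: &str, fields: &HandoffFields) -> String {
    if fields.is_empty() {
        // Legacy path: byte-identical to the caller's prompt.
        return prompt.to_string();
    }

    let mut out = String::new();
    out.push_str("Task:\n");
    out.push_str(prompt.trim_end());

    if let Some(ctx) = fields.context.as_deref().filter(|c| !c.trim().is_empty()) {
        out.push_str("\n\nContext:\n");
        out.push_str(ctx.trim_end());
    }

    if !fields.constraints.is_empty() {
        out.push_str("\n\nConstraints:\n");
        for constraint in &fields.constraints {
            out.push_str("- ");
            out.push_str(constraint);
            out.push('\n');
        }
    }

    out
}
```

With this shape, delegates that never pass structured fields (the comment names crypto and integrations) would receive exactly the string they received before, so the parity claim would hold literally rather than approximately.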