Proposal: evented extension architecture for prompts, tools, and steering #134

@sounkou-bioinfo

Context

This is a developer-facing proposal based on a comparison of corteza with earendil-works/pi and nousresearch/hermes-agent, plus the current cornball-ai/llm.api loop that corteza delegates to.

The goal is to help corteza evolve without losing its main advantage: a persistent, live R session that the agent can reason about and operate on.

Problem

Corteza already has powerful pieces:

  • persistent R session state;
  • R-native tools/skills;
  • SKILL.md prompt docs;
  • policy and approval handling;
  • MCP exposure;
  • context compaction and session persistence;
  • observers such as add_observer().

But these pieces are not yet unified as a first-class extension architecture.

Currently, turn() prepares the tools, system prompt, and history, and then delegates the actual model/tool loop to llm.api::agent():

chat()/CLI/MCP
  -> session_setup()
  -> new_session()
  -> turn(prompt, session)
  -> llm.api::agent(..., tool_handler = .make_tool_handler(...))

Relevant corteza references:

  • R/turn.R: new_session() holds mutable session state.
  • R/turn.R: .make_tool_handler() wraps tool execution with policy, approval, dry-run, task interception, and observers.
  • R/turn.R: turn() calls llm.api::agent().
  • R/context.R: load_context() assembles the system prompt.
  • R/skill.R: skill_spec() defines executable tools/skills and SKILL.md support.
  • R/registry.R: register_skill() stores skills in the global registry.

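Because llm.api::agent() owns the inner model/tool loop, corteza has few points at which to intervene. This makes features like the following harder than they should be: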

  • live steering while an agent is running;
  • extension-driven system prompt changes;
  • prompt templates;
  • tool-result transformations such as truncation/redaction/stashing;
  • custom compaction behavior;
  • provider payload inspection;
  • extension packages as normal R packages;
  • dynamic active tool sets;
  • extension-scoped permissions.

Lessons from pi-mono

Pi separates concepts clearly:

agent core loop
provider/auth layer
coding-agent/session layer
extension system
tools
skills
prompt templates
TUI

Most importantly, pi distinguishes extensions from tools and skills.

  • A tool is callable by the LLM.
  • A skill or prompt template is prompt/workflow guidance.
  • An extension can register tools, subscribe to lifecycle events, modify prompts/context, block or transform tool calls/results, add commands, and interact with session state.

Pi’s extension API exposes lifecycle events such as:

  • before_agent_start
  • context
  • before_provider_request
  • after_provider_response
  • tool_call
  • tool_result
  • message_end
  • turn_start / turn_end

This makes features like steering and tool-result transformation natural instead of ad hoc.

Lessons from hermes-agent

Hermes-agent is much larger and more product-specific, but two ideas are useful:

  1. Prompt tiers: stable/context/volatile prompt parts.
  2. Behavior-affecting hooks: a few hooks can return controlled mutations, e.g. inject context, block tool calls, transform tool results, or transform final LLM output.

The prompt-tier idea is particularly relevant for corteza because R session state, project context, tool guidance, and volatile runtime information should not all be treated as one opaque string.

Proposal

Add a first-class event/hook and extension architecture to corteza.

1. Introduce lifecycle events

Possible public API:

register_hook("before_turn", handler)
register_hook("before_system_prompt", handler)
register_hook("after_system_prompt", handler)
register_hook("context", handler)
register_hook("tool_call", handler)
register_hook("tool_result", handler)
register_hook("message_end", handler)
register_hook("steering_received", handler)
register_hook("turn_end", handler)

Handlers receive:

handler <- function(event, ctx) {
  ...
}

Here, ctx includes the session, config, and working directory (cwd), along with safe host capabilities.

Some hooks should be observe-only. Others should support controlled return values.
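The distinction between observing and mutating can be made concrete with a small registry. The following is a minimal sketch, not a proposed implementation: it assumes hooks live in a package-level environment keyed by event name, that a handler returning a named list overrides the matching fields of the event, and that NULL means "no change". The `.hooks` environment and the shape of `ctx` are hypothetical.

```r
# Hypothetical package-level registry: event name -> list of handlers.
.hooks <- new.env(parent = emptyenv())

register_hook <- function(name, handler) {
  stopifnot(is.character(name), length(name) == 1L, is.function(handler))
  # Append to any handlers already registered for this event.
  assign(name, c(.hooks[[name]], list(handler)), envir = .hooks)
  invisible(handler)
}

# Run every handler for `name` in registration order. A handler that returns
# a named list overrides the matching event fields; NULL leaves the event as is.
emit_event <- function(session, name, event = list()) {
  ctx <- list(session = session, cwd = getwd())  # assumed minimal ctx
  for (handler in .hooks[[name]]) {
    patch <- handler(event, ctx)
    if (is.list(patch)) event <- utils::modifyList(event, patch)
  }
  event
}

# Usage: a mutating tool_result hook with a deliberately small limit.
register_hook("tool_result", function(event, ctx) {
  if (event$tool == "bash" && nchar(event$result) > 10) {
    list(result = paste0(substr(event$result, 1, 10), "[truncated]"))
  }
})

out <- emit_event(NULL, "tool_result",
                  list(tool = "bash", result = strrep("x", 50)))
```

For an observe-only event, the host could call the handlers the same way and simply discard whatever they return.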

Example: tool-result truncation/redaction/stashing.

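register_hook("tool_result", function(event, ctx) {
  # Shorten very large bash output; any other result falls through as NULL,
  # which is assumed here to mean "leave the result unchanged".
  if (event$tool == "bash" && nchar(event$result) > 20000) {
    list(result = paste0(substr(event$result, 1, 12000), "\n\n[truncated]"))
  }
})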

Example: prompt mutation.

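register_hook("after_system_prompt", function(event, ctx) {
  # readLines() returns one element per line, so collapse it to a single
  # string before appending it to the system prompt.
  guidance <- paste(readLines("guidance.md"), collapse = "\n")
  list(system = paste(event$system, guidance, sep = "\n\n"))
})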

2. Refactor system prompt assembly into structured parts

Instead of only returning one string from load_context(), expose prompt parts:

parts <- build_system_prompt_parts(session)
parts <- emit_event(session, "before_system_prompt", parts)
system <- render_system_prompt_parts(parts)
system <- emit_event(session, "after_system_prompt", system)

Suggested tiers:

list(
  stable = list(
    identity = ...,
    tool_guidance = ...,
    skill_docs = ...
  ),
  context = list(
    project_briefing = ...,
    context_files = ...,
    package_docs = ...
  ),
  volatile = list(
    workspace_summary = ...,
    task_state = ...,
    memory = ...,
    current_time = ...
  )
)

This would make prompt changes inspectable and cacheable, and would let them be gated by permissions.
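One way the rendering step could work is sketched below. It assumes each part is a character vector (or NULL), that tiers are emitted in the order stable, context, volatile, and that each part is introduced by a "## name" heading. The heading format and the `tiers` argument are assumptions made for illustration.

```r
# Hypothetical renderer: flatten tiered prompt parts into one string,
# keeping tier order and skipping parts that are NULL or empty.
render_system_prompt_parts <- function(parts,
                                       tiers = c("stable", "context", "volatile")) {
  sections <- character(0)
  for (tier in tiers) {
    for (name in names(parts[[tier]])) {
      # Collapse multi-line parts; NULL collapses to the empty string.
      text <- paste(parts[[tier]][[name]], collapse = "\n")
      if (!nzchar(text)) next
      sections <- c(sections, paste0("## ", name, "\n", text))
    }
  }
  paste(sections, collapse = "\n\n")
}

# Usage with placeholder content.
parts <- list(
  stable   = list(identity = "You are an R agent."),
  context  = list(project_briefing = ""),
  volatile = list(current_time = "12:00")
)
prompt <- render_system_prompt_parts(parts)
```

Because the stable tier is rendered first, the leading portion of the prompt stays identical across turns when only volatile parts change, which is the property a prompt cache would rely on.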

3. Make extensions distinct from tools/skills

Proposed extension layout:

.corteza/extensions/my-extension/
  DESCRIPTION
  extension.R
  tools/
  prompts/
  skills/
  hooks/
  config.json

Also support normal R packages that expose a corteza_extension() registration function.

Public APIs could include:

register_extension()
register_tool()
register_hook()
register_prompt_template()
register_slash_command()
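A rough sketch of how package-based registration might fit together follows. It assumes the host passes a small API object to the extension's corteza_extension() function and records what was registered under the extension's name. The `register_extension(name, setup)` signature, the `api` object, and the example tool are all hypothetical.

```r
# Hypothetical host-side helper: call an extension's registration function
# with a small API object and record what the extension registered.
register_extension <- function(name, setup) {
  registered <- list(tools = character(0), hooks = character(0))
  api <- list(
    register_tool = function(tool_name, fn) {
      registered$tools <<- c(registered$tools, tool_name)
      invisible(fn)
    },
    register_hook = function(event_name, handler) {
      registered$hooks <<- c(registered$hooks, event_name)
      invisible(handler)
    }
  )
  setup(api)
  c(list(name = name), registered)
}

# What an extension package might export as corteza_extension().
corteza_extension <- function(api) {
  api$register_tool("summarize_df", function(df) {
    utils::capture.output(summary(df))
  })
  api$register_hook("tool_result", function(event, ctx) NULL)
}

ext <- register_extension("my-extension", corteza_extension)
```

Routing registration through a per-extension API object means the host knows which extension owns each tool and hook, which is one possible basis for extension-scoped permissions.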

4. Preserve backward compatibility

add_observer() can remain, but be implemented as a compatibility layer over generalized events.

skill_spec() can remain as the internal representation, but a public register_tool() / corteza_tool() API should wrap it.

Steering as a motivating feature

The console currently supports interruption, not live steering. While the agent is running, the synchronous REPL is blocked inside turn() / llm.api::agent() and cannot naturally accept new user messages.

The key recommendation is to avoid adding steering as an ad hoc /steer hack.

Instead, steering should fall out of a general evented loop abstraction:

register_hook("steering_received", function(event, ctx) {
  list(action = "inject", messages = list(
    list(role = "user", content = paste0("[Steering]\n", event$text))
  ))
})

A future async CLI/console can queue steering messages; the loop can consume them at safe continuation points.

This likely needs cooperation from llm.api, because llm.api::agent() owns the inner model/tool loop.
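The queue-and-drain idea can be illustrated with a toy loop. This is only a sketch under stated assumptions: `new_steering_queue()` and `run_loop()` are hypothetical names, `step` stands in for one model call plus its tool calls, and the loop here is not llm.api::agent(). A real front end would also need some asynchronous way to push messages while the loop is running.

```r
# Hypothetical steering queue: a front end pushes messages while the agent
# runs; the loop drains them at safe continuation points (between steps).
new_steering_queue <- function() {
  pending <- list()
  list(
    push = function(text) {
      msg <- list(role = "user", content = paste0("[Steering]\n", text))
      pending[[length(pending) + 1L]] <<- msg
      invisible(length(pending))
    },
    drain = function() {
      out <- pending
      pending <<- list()
      out
    }
  )
}

# Toy loop standing in for the model/tool loop.
run_loop <- function(messages, queue, step, max_steps = 3L) {
  for (i in seq_len(max_steps)) {
    messages <- c(messages, queue$drain())  # safe continuation point
    messages <- step(messages)
  }
  messages
}

# Usage: one steering message is queued before the loop starts.
q <- new_steering_queue()
q$push("use data.table")
fake_step <- function(m) c(m, list(list(role = "assistant", content = "ok")))
msgs <- run_loop(list(list(role = "user", content = "hi")), q, fake_step,
                 max_steps = 2L)
```

The steering message is appended before the first step and the queue is empty afterwards, so later steps see it exactly once.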

Suggested first milestone

  1. Add R/events.R with register_hook() and emit_event().
  2. Emit events around turn() and .make_tool_handler():
    • before_turn
    • before_system_prompt
    • after_system_prompt
    • tool_call
    • tool_result
    • turn_end
    • error
  3. Refactor prompt construction into structured parts.
  4. Add a public register_tool() API wrapping existing skill machinery.
  5. Coordinate with llm.api for pending-message / steering continuation points.

Why this helps corteza

This makes corteza easier to extend while keeping it small and R-native.

It would enable contributors to help corteza by building:

  • prompt packs;
  • R-domain tool packages;
  • RStudio/Positron workflows;
  • data-analysis skills;
  • tool-result transformers;
  • permission extensions;
  • steering/interrupt policies;
  • custom compaction and memory systems;
  • provider instrumentation.

Most importantly, it keeps corteza’s core identity intact: an agent runtime with direct access to live R state.


🟨 Powered by corteza

Co-Authored-By: corteza noreply@cornball.ai
