This guide explains how one DeepScientist turn is actually driven.
Use it when you want to understand:
- how the runtime builds the prompt for each turn
- what each stage skill is for
- how the built-in MCP tools are structured
- which file or tool you should change when behavior feels wrong
If you only want the user-facing product overview, read 13 Core Architecture Guide first.
If you only want the built-in memory contract, read 07 Memory and MCP after this page.
DeepScientist does not run from one static mega-prompt.
For every turn, it rebuilds a prompt from:
- the core system prompt
- a shared interaction contract
- runtime state
- quest files
- startup contract
- selected memory
- connector-specific rules when needed
- the active skill structure
Then the agent works through three built-in MCP namespaces only:
- `memory`
- `artifact`
- `bash_exec`
The most important files are:
- `src/prompts/system.md`
- `src/prompts/contracts/shared_interaction.md`
- `src/prompts/connectors/qq.md`
- `src/prompts/connectors/weixin.md`
- `src/prompts/connectors/lingzhu.md`
- `src/deepscientist/prompts/builder.py`
- `src/skills/*/SKILL.md`
- `src/deepscientist/mcp/server.py`
In practice:
- `system.md` defines the global operating stance
- `shared_interaction.md` defines user-visible continuity rules
- `connectors/*.md` inject connector-specific behavior only when that connector is active or bound
- `builder.py` decides prompt assembly order and runtime context sections
- `SKILL.md` files define stage-specific execution discipline
- `mcp/server.py` defines the built-in tool surface
The current runtime assembles the turn prompt in roughly this order:
- `system.md`
- `contracts/shared_interaction.md`
- runtime context block
- active communication surface block
- optional connector contract block
- turn driver and continuation guard
- active user requirements
- quest context
- recent durable state
- research delivery policy
- paper and evidence snapshot
- retry recovery packet when this is a retry turn
- interaction style block
- priority memory for this turn
- recent conversation window
- current turn attachments
- current user message
That order matters.
The runtime is trying to answer three questions before the model acts:
- what is the quest trying to do now
- what durable state already exists
- what behavior rules apply on this surface and in this stage
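The ordered assembly above can be sketched as a fixed section list joined into one prompt, with optional blocks simply absent. This is a minimal illustration, not the real `PromptBuilder` API; the section names and the `build_prompt` helper are assumptions for the sketch.

```python
# Hypothetical sketch of per-turn prompt assembly in the order described above.
SECTION_ORDER = [
    "system", "shared_interaction", "runtime_context", "surface",
    "connector_contract", "turn_driver", "user_requirements", "quest_context",
    "durable_state", "delivery_policy", "evidence_snapshot", "retry_packet",
    "interaction_style", "priority_memory", "conversation_window",
    "attachments", "user_message",
]

def build_prompt(sections: dict[str, str]) -> str:
    """Join non-empty sections in the fixed order; optional blocks
    (e.g. the connector contract or retry packet) are simply missing."""
    parts = [sections[name] for name in SECTION_ORDER if sections.get(name)]
    return "\n\n".join(parts)

prompt = build_prompt({
    "system": "# system.md contents",
    "user_message": "please rerun the ablation",
})
```

The key property is that the fixed order survives no matter which optional sections a given turn happens to include.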
This is the global DeepScientist operating contract.
It defines things like:
- long-horizon evidence-first behavior
- use `bash_exec` for shell-like execution
- use durable files, logs, and artifacts as truth
- do not end a quest early
- treat web, TUI, and connectors as one quest
- user-facing reporting style
- baseline confirmation discipline, including preserving the richer metric surface instead of keeping only one headline scalar when the source baseline exposes multiple comparable metrics or variants
If the agent starts sounding wrong everywhere, system.md is one of the first places to inspect.
This file defines the common continuity spine around artifact.interact(...).
It tells the agent:
- `artifact.interact(...)` is the main user-visible thread
- queued inbound user messages must be acknowledged and handled first
- blocking replies are for real decisions only
- progress updates should be concise and human-readable
- the real user-facing interaction message should stay complete; the runtime may derive a shorter preview separately, so the agent should not manually truncate the actual connector answer with `...` or `…`
If the model is bad at staying in the same long-running thread, this file matters a lot.
The prompt builder adds a surface block each turn.
This tells the model:
- whether the current turn is local, QQ, Weixin, or another connector
- how many external connectors are bound
- which surface is active right now
- how much detail is appropriate on that surface
This is why connector behavior should not be hard-coded globally.
The same quest may be viewed from the web UI, TUI, or a connector, but the reply shape should adapt.
Connector prompt fragments are loaded only when needed.
Current connector prompt files are:
- `src/prompts/connectors/qq.md`
- `src/prompts/connectors/weixin.md`
- `src/prompts/connectors/lingzhu.md`
These files define surface-specific rules such as:
- reply length
- text-first versus media-enabled behavior
- how attachments should be sent
- what not to expose in chat
For example:
- QQ is treated as a milestone operator surface
- Weixin is treated as a concise phone-side operator surface with `context_token` continuity
- Lingzhu is treated as an even shorter, more constrained surface
If one connector needs behavior changes, change its connector prompt first before bloating the global system prompt.
The builder injects runtime facts such as:
- `quest_id`
- `quest_root`
- current workspace branch
- active idea id
- active analysis campaign id
- bound conversations
- startup contract
- baseline gate
- active interactions
- recent artifacts
- recent runs
This is what makes the prompt quest-aware instead of generic.
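Rendering those runtime facts into a prompt section could look like the following. The field names mirror the list above; the rendering function itself is a hypothetical sketch, not the builder's real code.

```python
def runtime_context_block(facts: dict) -> str:
    """Render known runtime facts as one prompt section, skipping unset ones."""
    lines = [f"- {key}: {value}" for key, value in facts.items() if value is not None]
    return "## Runtime context\n" + "\n".join(lines)
```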
The builder reads these quest files directly into the prompt:
- `brief.md`
- `plan.md`
- `status.md`
- `SUMMARY.md`
This is important:
- the live prompt is not only based on chat history
- durable quest docs are treated as first-class truth surfaces
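Folding the durable quest docs into the prompt can be sketched as reading each file that exists, in a fixed order. The file names come from the list above; the helper and its heading convention are illustrative assumptions.

```python
from pathlib import Path

QUEST_DOCS = ("brief.md", "plan.md", "status.md", "SUMMARY.md")

def quest_context_block(quest_root: Path) -> str:
    """Inline each durable quest doc that exists, in a fixed order."""
    chunks = []
    for name in QUEST_DOCS:
        path = quest_root / name
        if path.exists():
            chunks.append(f"## {name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(chunks)
```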
This block converts startup choices into concrete execution rules.
It includes logic around:
- whether paper delivery is required
- launch mode
- custom profile
- baseline routing
- idea routing
- paper branch behavior
- review gate behavior
If Start Research behavior feels wrong, you usually need to inspect:
- the `startup_contract`
- `src/deepscientist/prompts/builder.py`
- the stage skill the quest is currently using
This block tells the model how to speak on this turn.
It includes:
- locale bias
- blocking versus threaded behavior
- long-run update cadence
- how to acknowledge mailbox messages
- how to compress progress into human-readable updates
This is why DeepScientist can keep the same runtime but behave differently across:
- long experiments
- connector replies
- writing stages
- waiting-for-decision stages
DeepScientist does not inject memory randomly.
PromptBuilder uses a stage-specific memory plan.
Examples:
- `scout` prefers `papers`, `knowledge`, `decisions`
- `baseline` prefers `papers`, `decisions`, `episodes`, `knowledge`
- `idea` prefers `papers`, `ideas`, `decisions`, `knowledge`
- `experiment` prefers `ideas`, `decisions`, `episodes`, `knowledge`
This means the prompt is stage-biased on purpose.
The agent should not see the same memory bundle on every turn.
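A stage-biased selection like this can be sketched as a plain preference table plus a filter-and-rank step. The stage-to-kind mapping below mirrors the examples in the text; the card shape and `select_memory` helper are illustrative assumptions, not the real `PromptBuilder` plan.

```python
# Stage-specific memory preferences, as described above (illustrative).
MEMORY_PLAN = {
    "scout": ["papers", "knowledge", "decisions"],
    "baseline": ["papers", "decisions", "episodes", "knowledge"],
    "idea": ["papers", "ideas", "decisions", "knowledge"],
    "experiment": ["ideas", "decisions", "episodes", "knowledge"],
}

def select_memory(stage: str, cards: list[dict]) -> list[dict]:
    """Keep only cards whose kind the stage prefers, ranked by that preference."""
    preferred = MEMORY_PLAN.get(stage, [])
    kept = [card for card in cards if card["kind"] in preferred]
    return sorted(kept, key=lambda card: preferred.index(card["kind"]))
```

The same card pool yields a different bundle per stage, which is exactly the "stage-biased on purpose" behavior.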
Prompt fragments can be overridden per quest under:
- `.codex/prompts/system.md`
- `.codex/prompts/contracts/shared_interaction.md`
- `.codex/prompts/connectors/<connector>.md`
This means:
- repo defaults live under `src/prompts/`
- quest-local prompt overrides live under `.codex/prompts/`
Use quest-local prompt overrides only when the quest truly needs a different local contract.
Do not fork the repo-level prompt unless the change should affect the product globally.
DeepScientist currently has two skill layers:
- standard stage skills
- companion skills
These are the main research anchors:
| Skill | Use when | Main job | Usually hands off to |
|---|---|---|---|
| scout | the task frame is still unclear | task framing, baseline discovery, metric and dataset clarification | baseline or idea |
| baseline | a trustworthy baseline does not yet exist | attach, import, reproduce, repair, and verify the baseline | idea |
| idea | the baseline is clear but the next direction is not | generate, compare, and select durable research directions | experiment |
| experiment | one selected idea is ready to run | implement and evaluate the main run on one durable line | analysis-campaign, write, or decision |
| analysis-campaign | follow-up experiments are needed | run slices, ablations, robustness checks, or reviewer-facing supplements | write, decision, or finalize |
| write | there is enough evidence to draft | turn accepted evidence into outline, draft, and paper bundle work | review or finalize |
| finalize | the quest is near closure | consolidate claims, summaries, final state, and closure checks | quest completion approval |
| decision | a durable route choice is needed | make a clear go/stop/branch/reuse decision from evidence | another anchor |
These are auxiliary entry or quality-control skills:
| Skill | Use when | Main job |
|---|---|---|
| figure-polish | a figure is important beyond debug use | render-inspect-revise a milestone or paper figure |
| intake-audit | the quest already has meaningful prior state | trust-rank old assets and choose the correct next anchor |
| review | a substantial draft already exists | run a skeptical paper-like audit before claiming done |
| rebuttal | reviewer comments or revision requests exist | map reviewer pressure into experiments, text deltas, and response artifacts |
The daemon is not supposed to contain a giant hard-coded research scheduler.
Instead:
- the prompt defines the operating contract
- the skill defines stage-specific discipline
- the runtime persists state and routes turns
That is the core DeepScientist design choice.
These are the durable outputs you should expect:
| Skill | Typical durable outputs |
|---|---|
| scout | updated brief.md, updated plan.md, literature notes, framing memory |
| baseline | PLAN.md, CHECKLIST.md, baseline verification notes, confirmed or waived baseline state |
| idea | durable idea draft, selected idea package, rationale for why this route won |
| experiment | implementation changes, run logs, record_main_experiment(...), result evidence |
| analysis-campaign | campaign manifest, slice records, synthesis notes |
| write | selected outline, writing plan, draft, references, claim-evidence map, paper bundle |
| finalize | final summary, closure state, final quest health check |
| decision | durable route decision, next-anchor recommendation |
| intake-audit | trusted-versus-untrusted asset map, next anchor recommendation |
| review | review report, revision log, experiment TODO list |
| rebuttal | review matrix, response letter, text deltas, evidence-update plan |
| figure-polish | final polished figure assets and render-checked outputs |
DeepScientist keeps the built-in MCP surface intentionally small.
Only these namespaces are built in:
- `memory`
- `artifact`
- `bash_exec`
There is no separate public built-in git namespace.
Git-aware behavior is exposed through artifact.
Purpose:
- reusable knowledge
- lessons that should survive beyond one turn
- quest-local or global memory cards
Current built-in tools:
- `memory.write(...)`
- `memory.read(...)`
- `memory.search(...)`
- `memory.list_recent(...)`
- `memory.promote_to_global(...)`
Use memory when the output should be remembered and reused later.
Do not use it for transient progress chatter.
Purpose:
- quest control plane
- durable research state
- Git-aware branch and worktree routing
- experiment and paper records
- user-visible interaction continuity
The artifact namespace is large, but it is still one family.
- `artifact.record(...)`
- `artifact.refresh_summary(...)`
- `artifact.render_git_graph(...)`
- `artifact.checkpoint(...)`
- `artifact.prepare_branch(...)`
- `artifact.activate_branch(...)`
- `artifact.submit_idea(...)`
- `artifact.list_research_branches(...)`
- `artifact.resolve_runtime_refs(...)`
- `artifact.publish_baseline(...)`
- `artifact.attach_baseline(...)`
- `artifact.confirm_baseline(...)`
- `artifact.waive_baseline(...)`
- `artifact.record_main_experiment(...)`
- `artifact.create_analysis_campaign(...)`
- `artifact.get_analysis_campaign(...)`
- `artifact.record_analysis_slice(...)`
- `artifact.submit_paper_outline(...)`
- `artifact.list_paper_outlines(...)`
- `artifact.submit_paper_bundle(...)`
- `artifact.arxiv(...)`
- `artifact.interact(...)`
- `artifact.complete_quest(...)`
The most important artifact tool for long-running collaboration is:
`artifact.interact(...)`
Because it keeps together:
- user-visible updates
- mailbox polling
- connector delivery
- threaded continuity
- attachment delivery
Purpose:
- durable shell execution
- monitored long runs
- durable logs
- stoppable and readable sessions
Current built-in tool:
- `bash_exec.bash_exec(...)`
This one tool supports multiple modes:
- `detach`
- `await`
- `read`
- `kill`
- `list`
- `history`
The design rule is simple:
- anything shell-like should go through `bash_exec`
- do not hide important execution in transient shell snippets
Use this mental model:
- `memory`: remember
- `artifact`: decide and record
- `bash_exec`: run and monitor
Examples:
- a reusable lesson from a failed run -> `memory.write(...)`
- confirming a baseline -> `artifact.confirm_baseline(...)`
- launching training -> `bash_exec.bash_exec(mode='detach', ...)`
- notifying the user about the next checkpoint -> `artifact.interact(...)`
If you mix these roles badly, the quest becomes harder to resume and audit.
A typical turn looks like this:
- a user or connector message arrives
- the daemon restores quest snapshot and history
- `PromptBuilder` assembles the current turn prompt
- the active skill defines the stage discipline
- priority memory is injected
- the agent uses `memory`, `artifact`, and `bash_exec`
- outputs are persisted into files, artifacts, memory cards, logs, and Git state
- `artifact.interact(...)` keeps the user-facing thread continuous
That is why DeepScientist feels more like a persistent workshop than a stateless chat.
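The steps above can be condensed into a skeletal turn loop. Every name here (`run_turn`, `agent_step`, the state dict shape) is a stand-in for runtime machinery, not a real API; it only shows that state persists across turns while the prompt is rebuilt each time.

```python
def agent_step(prompt: str) -> str:
    return f"handled: {prompt}"  # placeholder for the model + MCP tool loop

def run_turn(message: str, state: dict) -> dict:
    """One turn: record inbound, rebuild a stage-aware prompt, persist output."""
    state.setdefault("history", []).append(("user", message))  # restore + record inbound
    prompt = f"[{state.get('skill', 'scout')}] {message}"      # stage-aware prompt build
    reply = agent_step(prompt)
    state["history"].append(("agent", reply))                  # persist the outcome
    return state
```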
Use this quick rule:
- change `src/prompts/system.md` when the global operating stance is wrong
- change `src/prompts/contracts/shared_interaction.md` when continuity behavior is wrong
- change `src/prompts/connectors/*.md` when one connector behaves wrong
- change `src/skills/<skill>/SKILL.md` when one stage behaves wrong
- change `src/deepscientist/prompts/builder.py` when prompt assembly or runtime context selection is wrong
- change `src/deepscientist/mcp/server.py` when the built-in tool surface itself is wrong
Do not use a giant prompt patch to fix a real MCP contract bug.
Do not use a new MCP tool to fix a stage-discipline problem that belongs in a skill.