diff --git a/architecture/CMD.md b/architecture/CMD.md new file mode 100644 index 0000000..bf1acbd --- /dev/null +++ b/architecture/CMD.md @@ -0,0 +1,87 @@ +# Cmd Command Architecture + +Command: `clai [flags] cmd ` + +The **cmd** command is a specialized variant of the text querier (`query`) designed to produce shell commands. It reuses the text pipeline end-to-end, but: + +- switches the system prompt to a “write only a bash command” prompt +- enables `cmdMode` on the querier, which adds an execute/quit confirmation loop after the model finishes streaming output + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → CMD + → setupTextQuerierWithConf(..., CMD, ...) + → Load textConfig.json (or default) + → tConf.CmdMode = true + → tConf.SystemPrompt = tConf.CmdModePrompt + → applyFlagOverridesForText(...) + → ProfileOverrides() + → setupToolConfig(...) + → applyProfileOverridesForText(...) + → SetupInitialChat(args) + → CreateTextQuerier(ctx, tConf) + → querier.Query(ctx) + → (stream tokens) + → if cmdMode: handleCmdMode() → optionally execute +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | CMD mode dispatch; sets `CmdMode` and `SystemPrompt` before chat setup | +| `internal/text/conf.go` | Defines `CmdModePrompt` default and config fields | +| `internal/text/conf_profile.go` | Special handling when combining profiles with cmd-mode prompt | +| `internal/text/querier_setup.go` | Propagates `CmdMode` into the runtime querier (`querier.cmdMode`) | +| `internal/text/querier_cmd_mode.go` | Implements the cmd execution confirmation loop | + +## Prompting / profiles interaction + +In cmd mode, profiles are still allowed, but `internal/text/conf_profile.go` ensures the cmd-mode prompt stays authoritative. It wraps prompts in a pattern roughly like: + +```text +|| | || +``` + +and explicitly warns the model not to disobey the cmd prompt. 
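As a rough illustration of the subordination pattern (the exact delimiter text is elided above, so `||` and the function name here are purely hypothetical stand-ins for whatever `internal/text/conf_profile.go` actually uses):

```go
package main

import "fmt"

// wrapProfilePrompt sketches how a profile prompt could be sandwiched
// between copies of the cmd-mode prompt so the cmd prompt stays
// authoritative. Delimiters and naming are assumptions, not clai's API.
func wrapProfilePrompt(cmdPrompt, profilePrompt string) string {
	return fmt.Sprintf("%s || %s || %s", cmdPrompt, profilePrompt, cmdPrompt)
}

func main() {
	fmt.Println(wrapProfilePrompt("write only a bash command", "prefer POSIX sh"))
}
```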
+ +Also note: profile tool enabling is restricted in cmd mode: + +- `c.UseTools = (profile.UseTools && !c.CmdMode) || (len(profile.McpServers) > 0)` + +So a profile cannot force-enable built-in tools in cmd mode, but MCP servers may still enable MCP tools. + +## Runtime behavior (`handleCmdMode`) + +After streaming completes, `handleCmdMode()`: + +1. Prints a newline (streaming may end without `\n`). +2. Enters a loop: + + ```text + Do you want to [e]xecute cmd, [q]uit?: + ``` + +3. If user selects `e`, it executes the model output as a local process. + +### Execution details (`executeLlmCmd`) + +`executeLlmCmd()`: + +- expands `~` to `$HOME` via `utils.ReplaceTildeWithHome` +- removes all double quotes (`"`) from the output to approximate typical shell expansion behavior +- splits by spaces into `cmd` + `args` +- runs `exec.Command(cmd, args...)` with stdout/stderr wired to the current process + +Errors: + +- non-zero exit code is wrapped into a formatted `code: , stderr: ''` message +- other exec errors are wrapped with context + +## Security notes + +Cmd mode can execute arbitrary commands. The safety mechanism is explicit user confirmation before execution. Tool restrictions via profiles/flags can further reduce risk, but cannot make executing a suggested command “safe” by itself. diff --git a/architecture/CONFIG.md b/architecture/CONFIG.md new file mode 100644 index 0000000..4bf04e0 --- /dev/null +++ b/architecture/CONFIG.md @@ -0,0 +1,247 @@ +# Configuration Architecture + +This document describes **how configuration works** in clai: where config is stored, how files are created, and how the *override cascade* is applied (defaults → mode config → model-specific config → profiles → flags). + +It is the “index” doc for understanding why a command behaves the way it does. 
+ +## Config directories + +A clai install uses two primary directories: + +- **Config dir**: `utils.GetClaiConfigDir()` ⇒ typically: + + ```text + / .clai + ``` + +- **Cache dir**: `utils.GetClaiCacheDir()` ⇒ typically: + + ```text + / .clai + ``` + +On startup, `main.run()` ensures the config dir exists: + +- `utils.CreateConfigDir(configDirPath)` + +The config dir is also printed in `clai help` (see `main.go` usage template). + +## Config file types + +There are *three* main axes: + +1. **Mode configs** (coarse per-command defaults) +2. **Model-specific vendor request configs** (fine-grained provider settings) +3. **Profiles** (workflow presets that override mode+model config) + +Plus chat transcripts and reply pointers, which aren’t “config” but strongly affect behavior. + +### 1) Mode configs + +Stored at: + +- `/textConfig.json` +- `/photoConfig.json` +- `/videoConfig.json` + +They contain settings that are broadly applicable to that “mode” (text vs image vs video). For text this includes: + +- chosen model +- printing options (raw vs glow) +- system prompt +- tool use selection defaults +- globbing selection (via `-g` flag which then modifies prompt building) + +Mode config loading happens inside `internal.setupTextQuerierWithConf` / `internal.Setup`: + +- `utils.LoadConfigFromFile(confDir, "textConfig.json", migrateOldChatConfig, &text.Default)` +- `utils.LoadConfigFromFile(confDir, "photoConfig.json", migrateOldPhotoConfig, &photo.DEFAULT)` +- `utils.LoadConfigFromFile(confDir, "videoConfig.json", nil, &video.Default)` + +`LoadConfigFromFile` is responsible for: + +- creating the file from defaults if it doesn’t exist +- `json.Unmarshal` into the provided struct +- optionally running a migration callback + +### 2) Model-specific vendor configs + +These are JSON files created per *vendor+model*. + +They exist because different vendors expose different request options and clai avoids a combinatorial CLI flag explosion. 
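The per-model file naming can be sketched from the examples that follow (the helper name and exact scheme here are illustrative, inferred from filenames like `openai_gpt_gpt-4.1.json`):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// modelConfigPath sketches deriving a per-vendor model config path of the
// form <vendor>_<type>_<model>.json inside the config dir. Treat this as
// an inference from the example filenames, not clai's actual helper.
func modelConfigPath(configDir, vendor, modelType, model string) string {
	name := fmt.Sprintf("%s_%s_%s.json", vendor, modelType, model)
	return filepath.Join(configDir, name)
}

func main() {
	fmt.Println(modelConfigPath("/home/u/.config/.clai", "openai", "gpt", "gpt-4.1"))
}
```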
+ +Location: + +- `/__.json` + +Example (illustrative): + +- `openai_gpt_gpt-4.1.json` +- `anthropic_chat_claude-sonnet-4-20250514.json` + +Creation/loading typically occurs during querier creation (`CreateTextQuerier`, `CreatePhotoQuerier`, etc.) and is vendor-specific. + +**Important characteristic**: + +> These JSON files are effectively “request templates” that are unmarshaled into whatever request struct the vendor implementation uses. + +That is why setup exposes them as “model files” rather than as first-class flags. + +### 3) Profiles + +Profiles are stored as: + +- `/profiles/.json` + +Profiles are applied only for text-like modes (query/chat/cmd) and are intended to: + +- quickly switch prompts/workflows +- pin a model +- restrict or expand tool choices + +Profiles are created/edited via `clai setup` (stage 2), and inspected via `clai profiles list`. + +Profiles are applied inside `text.Configurations.ProfileOverrides()` (see `internal/text/conf.go` + `internal/text/profile_overrides.go` if present). + +### 4) Conversations and reply pointers (context state) + +Stored under: + +- `/conversations/*.json` +- `/conversations/prevQuery.json` (global reply context) +- `/conversations/dirs/*` (directory-scoped binding metadata) + +These are described in `architecture/CHAT.md`. + +They aren’t traditional config, but they influence prompt assembly (`-re`, `-dre`, `chat continue`, etc.). + +## The override cascade (text/query/chat/cmd) + +Text-like commands are configured in `internal/setup.go:setupTextQuerierWithConf`. + +The effective precedence is: + +1. **Hard-coded defaults** (`text.Default`) – lowest precedence +2. **Mode config file** (`textConfig.json`) +3. **Profiles** (`-p/-profile` or `-prp/-profile-path`) +4. **Flags** (CLI) + +There is also a *model-specific vendor config* layer which is loaded during querier creation. 
+ +A more faithful mental model: + +```text +text.Default + → merge textConfig.json + → apply “early” flag overrides (model/raw/reply/profile pointers) + → if glob mode: build glob context + → apply profile overrides (prompt/tools/model/etc) + → finalize tool selection (flags + profiles + defaults) + → re-apply “late” overrides (some flags override profile, e.g., -cm) + → build InitialChat (including reply context) + → CreateTextQuerier(...) loads vendor model config and produces runtime Model +``` + +### Where flags apply + +Flags are parsed in `internal/setup_flags.go:parseFlags` into `internal.Configurations`. + +For **text** the important override functions are: + +- `applyFlagOverridesForText(tConf, flagSet, defaultFlags)` +- `applyProfileOverridesForText(tConf, flagSet, defaultFlags)` (currently only ensures `-cm` can override profile model) + +Key behaviors: + +- default flags should *not* override file values; overrides only happen when the user provided a non-default flag value. +- `-dre` is implemented in `internal.Setup` by copying the directory-scoped conversation into `prevQuery.json` and then turning on reply mode. + +### Tool selection configuration + +Tool usage is controlled by: + +- `-t/-tools` CLI flag (string): `""`, `"*"`, or comma-separated list. +- `text.Configurations.UseTools` boolean (enable tool calling) +- `text.Configurations.RequestedToolGlobs` (names or wildcards) +- profiles can also set tool behavior + +`internal/setup.go:setupToolConfig` is the bridge between: + +- CLI’s `UseTools` string +- text configuration’s `UseTools` + `RequestedToolGlobs` + +Notable rules: + +- if `-t` is provided at all (even a list), it is interpreted as intent to enable tooling. +- `-t=*` clears requested list (meaning “allow all”). +- unknown tools are skipped with warnings. +- if nothing valid remains, tooling is disabled for that run. +- MCP tools are not validated against the local registry; names prefixed with `mcp_` are allowed. 
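The rules above can be condensed into a small sketch (function name and registry shape are illustrative; the real logic lives in `internal/setup.go:setupToolConfig`):

```go
package main

import (
	"fmt"
	"strings"
)

// resolveTools interprets a -t flag value per the rules listed above:
// "" leaves tooling off, "*" enables all tools, and a comma-separated
// list is filtered against the local registry, except that mcp_-prefixed
// names pass through unvalidated.
func resolveTools(tFlag string, known map[string]bool) (useTools bool, globs []string) {
	if tFlag == "" {
		return false, nil
	}
	if tFlag == "*" {
		return true, nil // empty request list means "allow all"
	}
	for _, name := range strings.Split(tFlag, ",") {
		name = strings.TrimSpace(name)
		if strings.HasPrefix(name, "mcp_") || known[name] {
			globs = append(globs, name)
		} else {
			fmt.Printf("warning: unknown tool %q skipped\n", name)
		}
	}
	// If nothing valid remains, tooling is disabled for this run.
	return len(globs) > 0, globs
}

func main() {
	ok, globs := resolveTools("bash,mcp_files,bogus", map[string]bool{"bash": true})
	fmt.Println(ok, globs)
}
```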
+ +### Reply/dir-reply configuration + +- `-re` sets `tConf.ReplyMode`. +- `-dre` is handled before text setup: + - `chat.SaveDirScopedAsPrevQuery(confDir)` + - flips reply mode on + +This means the rest of the system only needs to understand one reply mechanism: loading `prevQuery.json`. + +## Non-text config flows + +### Photo + +- Load `photoConfig.json` (with default `photo.DEFAULT`) +- Apply flag overrides: model, output dir/prefix/type, reply and stdin replacement +- Build prompt via `photo.Configurations.SetupPrompts()` +- Create vendor querier via `CreatePhotoQuerier(pConf)` + +See `PHOTO.md`. + +### Video + +Same pattern with `videoConfig.json` + `video.Configurations.SetupPrompts()`. + +See `VIDEO.md`. + +## Setup wizard and config file editing + +`clai setup` is the primary user interface to edit all of these files. + +It uses globbing under the config dir to find relevant files and offers actions: + +- reconfigure via structured prompts +- open in `$EDITOR` +- delete +- paste or create MCP server definitions + +See `SETUP.md`. + +## Implementation index + +If you need to follow configuration in code, start here: + +- `internal/setup_flags.go` + - CLI flags → internal struct + - applies overrides into mode configs +- `internal/setup.go` + - command dispatch + - text setup (`setupTextQuerierWithConf`) and special cases (`-dre`) +- `internal/utils/config.go` + `internal/utils/json.go` + - `LoadConfigFromFile`, `CreateFile`, etc. +- `internal/text/conf.go` + - text defaults, initial chat setup, reply/glob integration +- `internal/create_queriers.go` + - model name → vendor querier routing + +## Common debugging tips + +- Set `DEBUG=1` to print some config snapshots during setup. +- `DEBUG_PROFILES=1` prints tooling glob selection during setup. +- Most “why isn’t my flag working?” issues are precedence/cascade issues; trace: + 1. mode config loaded + 2. early flag overrides + 3. profile overrides + 4. tool selection + 5. late overrides + 6. 
initial chat construction diff --git a/architecture/DRE.md b/architecture/DRE.md new file mode 100644 index 0000000..ad2b59f --- /dev/null +++ b/architecture/DRE.md @@ -0,0 +1,57 @@ +# DRE (Directory Replay) Command Architecture + +Command: `clai [flags] dre` + +The **dre** command prints the most recent message from the **directory-scoped conversation** bound to the current working directory (CWD). + +This is the directory-scoped analog of `clai replay` / `clai re`. + +> Related: `clai -dre query ...` uses the bound chat as context. See `CHAT.md` (dir-scoped bindings) and `QUERY.md`. + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → DIRSCOPED_REPLAY + → setupDRE(...) → dreQuerier + → dreQuerier.Query(ctx) + → chat.Replay(raw, true) +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | Dispatches DIRSCOPED_REPLAY mode | +| `internal/dre.go` | Implements the `dre` command querier (`dreQuerier`) | +| `internal/chat/replay.go` | `Replay(raw, dirScoped)` + `replayDirScoped` | +| `internal/chat/dirscope.go` | Directory binding storage + lookup (`LoadDirScope`) | +| `architecture/CHAT.md` | Background: how conversations and dir bindings work | + +## How it finds the conversation + +Directory scope is loaded via `ChatHandler.LoadDirScope("")`; empty string means “use current working directory”. + +If no binding exists (`ds.ChatID == ""`), `dre` errors with: + +- `no directory-scoped conversation bound to current directory` + +Bindings are created/updated primarily by: + +- `clai query ...` (non-reply queries update the binding to the newly used chat) +- `clai chat continue ` (binds the selected chat to CWD) + +## What it prints + +Once `chatID` is resolved: + +1. Load `/conversations/.json`. +2. Select the last message in the transcript. +3. Print via `utils.AttemptPrettyPrint(..., raw)`. 
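The binding check described above can be sketched as follows (the struct shape is an assumption based on the `ds.ChatID` check mentioned in the text; a real implementation would read the `conversations/dirs/` metadata keyed by CWD):

```go
package main

import (
	"errors"
	"fmt"
)

// dirScope loosely mirrors the directory-binding metadata.
type dirScope struct {
	ChatID string
}

// resolveBoundChat is a stand-in for the post-LoadDirScope check: an
// empty ChatID means no conversation is bound to this directory.
func resolveBoundChat(ds dirScope) (string, error) {
	if ds.ChatID == "" {
		return "", errors.New("no directory-scoped conversation bound to current directory")
	}
	return ds.ChatID, nil
}

func main() {
	if _, err := resolveBoundChat(dirScope{}); err != nil {
		fmt.Println(err)
	}
	id, _ := resolveBoundChat(dirScope{ChatID: "a1b2c3"})
	fmt.Println(id)
}
```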
+ +## Error handling / exit codes + +- On success, `dre` prints and returns nil; `internal.Setup` does not force exit (it returns a querier), so normal exit code is 0. +- Missing binding or missing conversation file returns an error and results in non-zero exit. diff --git a/architecture/HELP.md b/architecture/HELP.md new file mode 100644 index 0000000..df6a1a4 --- /dev/null +++ b/architecture/HELP.md @@ -0,0 +1,52 @@ +# Help Command Architecture + +Command: `clai help` (aliases: `h`) + +The **help** command prints the usage string (defined in `main.go`) rendered with a few runtime defaults, plus some special-case help for `profile`. + +This is intentionally separate from `-h` flag behavior; `clai -h` is discouraged and replaced by a dummy flag message. + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → HELP + → printHelp(usage, allArgs) + → exit (utils.ErrUserInitiatedExit) +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `main.go` | Defines the `usage` template string | +| `internal/setup.go` | HELP dispatch and `printHelp()` implementation | +| `internal/utils/config.go` | `GetClaiConfigDir`, `GetClaiCacheDir` used to fill in template | + +## Behavior + +### `clai help profile` (special case) + +`printHelp()` checks: + +- if `len(args) > 1 && (args[1] == "profile" || args[1] == "p")` + +Then it prints `internal.ProfileHelp` and returns. + +This is the deep-ish help for profile concepts and usage. + +### General help (`clai help`) + +`printHelp()`: + +1. Resolves config and cache directories (best effort). +2. Calls `fmt.Printf(usage, ...)` to fill in defaults like: + - default `-re`, `-r`, `-t`, `-g`, `-p` values + - config dir + cache dir paths +3. Prints to stdout. + +## Exit behavior + +Returns `utils.ErrUserInitiatedExit`, so the process exits with code 0. 
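The dispatch described above reduces to a few lines; this sketch uses placeholder strings for the usage template and `internal.ProfileHelp` (the real `printHelp` also fills flag defaults and directory paths into the template via `fmt.Printf`):

```go
package main

import "fmt"

const profileHelp = "...deep profile help (stand-in for internal.ProfileHelp)..."

// printHelp sketches the special-casing: args[1] of "profile" (or its
// alias "p") selects the profile help; everything else prints usage.
func printHelp(usage string, args []string) {
	if len(args) > 1 && (args[1] == "profile" || args[1] == "p") {
		fmt.Println(profileHelp)
		return
	}
	fmt.Println(usage)
}

func main() {
	printHelp("usage: clai ...", []string{"help", "profile"})
	printHelp("usage: clai ...", []string{"help"})
}
```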
diff --git a/architecture/PHOTO.md b/architecture/PHOTO.md new file mode 100644 index 0000000..54ea309 --- /dev/null +++ b/architecture/PHOTO.md @@ -0,0 +1,104 @@ +# Photo Command Architecture + +Command: `clai [flags] photo ` (aliases: `p`) + +The **photo** command generates images using AI models (DALL-E, Gemini image generation) from a text prompt. + +## Entry Flow + +``` +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() # extract CLI flags + → getCmdFromArgs() # returns PHOTO mode + → LoadConfigFromFile("photoConfig.json") + → applyFlagOverridesForPhoto() + → pConf.SetupPrompts() # build prompt from args/stdin/reply + → CreatePhotoQuerier(pConf) # vendor-specific querier + → querier.Query(ctx) # execute the photo generation +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | `Setup()` PHOTO case — loads config, creates querier | +| `internal/photo/conf.go` | `Configurations` struct, `DEFAULT`, `OutputType` enum | +| `internal/photo/prompt.go` | `SetupPrompts()` — prompt assembly with reply/stdin support | +| `internal/photo/store.go` | `SaveImage()` — decodes base64 and writes to disk | +| `internal/create_queriers.go` | `CreatePhotoQuerier()` — routes to OpenAI or Gemini | +| `internal/vendors/openai/dalle.go` | OpenAI DALL-E photo querier implementation | +| `internal/vendors/gemini/image.go` | Gemini photo querier implementation | + +## Configuration + +### `photoConfig.json` + +```json +{ + "model": "gpt-image-1", + "prompt-format": "I NEED to test how the tool works with extremely simple prompts. 
DO NOT add any detail, just use it AS-IS: '%v'", + "output": { + "type": "local", + "dir": "$HOME/Pictures", + "prefix": "clai" + } +} +``` + +### Key Fields + +| Field | Description | +|-------|-------------| +| `model` | Model name (e.g., `gpt-image-1`, `dall-e-2`, `gemini-*`) | +| `prompt-format` | Go format string; `%v` is replaced with the user prompt | +| `output.type` | `"local"` (save to disk), `"url"` (print URL), or `"unset"` | +| `output.dir` | Directory for saved images (default: `$HOME/Pictures`) | +| `output.prefix` | Filename prefix (default: `clai`) | + +### Flag Overrides + +| Flag | Config Field | +|------|-------------| +| `-pm` / `-photo-model` | `model` | +| `-pd` / `-photo-dir` | `output.dir` | +| `-pp` / `-photo-prefix` | `output.prefix` | +| `-re` / `-reply` | Enables reply mode | +| `-I` / `-replace` | Stdin replacement token | + +## Prompt Assembly + +`Configurations.SetupPrompts()` in `internal/photo/prompt.go`: + +1. If **reply mode** (`-re`): loads `prevQuery.json`, serializes messages as JSON context, prepends to prompt +2. Calls `utils.Prompt(stdinReplace, args)` to build user prompt from CLI args + stdin +3. Formats prompt through `PromptFormat` (e.g., wrapping in the "AS-IS" instruction) + +## Vendor Routing + +`CreatePhotoQuerier()` in `internal/create_queriers.go`: + +| Model Pattern | Vendor | +|---------------|--------| +| contains `dall-e` or `gpt` | OpenAI (`openai.NewPhotoQuerier`) | +| contains `gemini` | Google (`gemini.NewPhotoQuerier`) | + +## Output + +### Local Storage + +`SaveImage()` in `internal/photo/store.go`: + +1. Decodes base64 response from the API +2. Generates filename: `_.png` +3. Writes to `output.dir`; falls back to `/tmp` on failure + +### URL Mode + +When `output.type` is `"url"`, the querier prints the image URL directly instead of downloading. 
+ +## Validation + +Before creating a querier: +- `ValidateOutputType()` ensures `output.type` is one of `local`, `url`, `unset` +- If `output.type` is `local`, the output directory must exist diff --git a/architecture/PROFILES.md b/architecture/PROFILES.md new file mode 100644 index 0000000..0febf92 --- /dev/null +++ b/architecture/PROFILES.md @@ -0,0 +1,77 @@ +# Profiles Command Architecture + +Command: `clai profiles [list]` + +The **profiles** command is a small inspection command that lists the configured profile JSON files under the clai config directory. + +Profiles themselves are used primarily via the `-p/-profile` and `-prp/-profile-path` flags on text-like commands (`query`, `chat`, `cmd`). Configuration semantics are described in `CONFIG.md`. + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → PROFILES + → profiles.SubCmd(ctx, allArgs) +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | Dispatches PROFILES mode | +| `internal/profiles/cmd.go` | Implements `clai profiles` command | +| `internal/utils/config_dir.go` (or similar) | `GetClaiConfigDir()` | +| `internal/utils/json.go` (or similar) | `ReadAndUnmarshal()` helper | + +## Behavior + +Implemented in `internal/profiles/cmd.go`. + +### Supported subcommands + +- `clai profiles` +- `clai profiles list` + +Any other subcommand returns an error. + +### Listing logic + +`runProfilesList()`: + +1. Resolve config dir (`utils.GetClaiConfigDir()`). +2. Determine `/profiles`. +3. If the directory doesn’t exist: + - prints a warning + - returns `utils.ErrUserInitiatedExit`. +4. Reads all `*.json` files. +5. For each file, tries to unmarshal a small subset view: + + ```go + type profile struct { + Name string `json:"name"` + Model string `json:"model"` + Tools []string `json:"tools"` + Prompt string `json:"prompt"` + } + ``` + + Malformed profiles are skipped. + +6. 
Backward compatible naming: if `Name` is empty, derive from filename. +7. Prints a small summary block per profile: + + - Name + - Model + - Tools + - First sentence/line from `Prompt` + +8. If no valid profiles were found, prints a warning. + +Returns `utils.ErrUserInitiatedExit`. + +## Developer notes + +- This command is intentionally conservative: it does not validate the full profile schema; it only displays what it can read. +- Creation/editing of profiles is done via `clai setup` (stage `2`). diff --git a/architecture/QUERY.md b/architecture/QUERY.md new file mode 100644 index 0000000..ba518b6 --- /dev/null +++ b/architecture/QUERY.md @@ -0,0 +1,135 @@ +# Query Command Architecture + +Command: `clai [flags] query ` (aliases: `q`) + +The **query** command is the primary way to send a one-shot text prompt to an LLM and receive a streamed response. It is the workhorse of clai. + +## Entry Flow + +``` +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() # extract CLI flags + → getCmdFromArgs() # returns QUERY mode + → setupTextQuerier() # build the Querier + → querier.Query(ctx) # execute the query +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | `Setup()` dispatches to `setupTextQuerier()` for QUERY mode | +| `internal/setup_flags.go` | `parseFlags()` extracts all CLI flags into `Configurations` | +| `internal/text/conf.go` | `text.Configurations` struct + `SetupInitialChat()` | +| `internal/text/querier_setup.go` | `NewQuerier()` — vendor routing, model config file creation | +| `internal/text/querier.go` | `Querier.Query()` — streaming loop, token handling, post-processing | +| `internal/text/querier_tool.go` | Tool call handling during query execution | +| `internal/utils/prompt.go` | `Prompt()` — stdin/args merging and `{}` replacement | +| `internal/create_queriers.go` | `CreateTextQuerier()` — vendor selection by model name | +| `internal/chat/reply.go` | `SaveAsPreviousQuery()` — persists result for 
`-re` replies | +| `internal/chat/chat.go` | `HashIDFromPrompt()` — generates chat IDs | + +## Configuration Cascade + +The query command applies configuration in this order (lowest to highest precedence): + +1. **Hard-coded defaults** (`text.Default` in `internal/text/conf.go`) +2. **`textConfig.json`** loaded from config dir +3. **Model-specific config** (e.g., `openai_gpt_gpt-5.2.json`) +4. **Profile overrides** (if `-p`/`-profile` or `-prp`/`-profile-path` is set) +5. **CLI flags** (e.g., `-cm`, `-r`, `-t`) + +See `CONFIG.md` for full details. + +## Prompt Assembly + +`text.Configurations.SetupInitialChat(args)` in `internal/text/conf.go`: + +1. If **not reply mode**: creates initial chat with system prompt message +2. If **glob mode** (`-g` flag): reads matching files into messages via `glob.CreateChat()` +3. If **reply mode** (`-re`): loads `prevQuery.json` and prepends those messages +4. Calls `utils.Prompt(stdinReplace, args)` to build the user prompt from CLI args + stdin +5. Runs `chat.PromptToImageMessage(prompt)` to detect and extract base64-encoded images +6. Appends the user message to `InitialChat.Messages` +7. 
Generates chat ID via `HashIDFromPrompt(prompt)` + +### Stdin Handling + +`utils.Prompt()` in `internal/utils/prompt.go`: + +- If pipe detected and no args: stdin becomes the prompt +- If pipe detected and args present: replaces `{}` (or custom `-I` token) in args with stdin content +- If no pipe: joins args as the prompt + +## Vendor Routing + +`CreateTextQuerier()` in `internal/create_queriers.go` routes by model name substring: + +| Pattern | Vendor | +|---------|--------| +| `hf:` / `huggingface:` prefix | HuggingFace | +| contains `claude` | Anthropic | +| contains `gpt` | OpenAI | +| contains `deepseek` | DeepSeek | +| contains `mercury` | Inception | +| contains `grok` | xAI | +| contains `mistral`/`mixtral`/`codestral`/`devstral` | Mistral | +| contains `gemini` | Google | +| `ollama:` prefix | Ollama | +| `novita:` prefix | Novita | + +Each vendor has a default config struct (e.g., `openai.GptDefault`). A model-specific JSON config file is created/loaded at `/__.json`. + +## Query Execution + +`Querier.Query()` in `internal/text/querier.go`: + +1. **Token warning**: estimates token count; prompts user if above `tokenWarnLimit` +2. **StreamCompletions**: calls `Model.StreamCompletions(ctx, chat)` → returns `chan CompletionEvent` +3. **Event loop**: reads from channel, dispatching: + - `string` → appends to `fullMsg`, prints to stdout (streaming output) + - `pub_models.Call` → tool call handling (see below) + - `error` → propagated + - `models.StopEvent` → cancels context + - `models.NoopEvent` → ignored +4. **Post-processing** (`postProcess()`): + - Appends assistant message to chat + - Saves conversation via `SaveAsPreviousQuery()` (unless in chat mode) + - Pretty-prints final output (via glow if available, unless `-r`/`--raw`) + +### Rate Limit Handling + +If `StreamCompletions` returns `ErrRateLimit`, the querier sleeps until the reset time and retries (up to 3 times). If the model implements `InputTokenCounter`, it uses adaptive backoff. 
+ +## Tool Calls + +When the LLM returns a `pub_models.Call` event: + +1. `handleToolCall()` in `internal/text/querier_tool.go` +2. Calls `doToolCallLogic()`: + - Post-processes current output + - Patches the call for vendor compatibility + - Appends assistant tool-call message to chat + - Invokes `tools.Invoke(call)` → looks up tool in registry, calls it + - Applies `toolOutputRuneLimit` truncation + - Appends tool output message to chat +3. Recursively calls `TextQuery()` with updated chat (model sees tool output and continues) + +Tool call limits (`max-tool-calls` in config) enforce a soft cap with escalating warnings. + +## Directory Scope Binding + +After a successful non-reply query, `Setup()` in `internal/setup.go` updates the directory-scoped binding: + +```go +chat.UpdateDirScopeFromCWD(claiConfDir, tConf.InitialChat.ID) +``` + +This allows subsequent `-dre` queries from the same directory to continue the conversation. + +## Output Modes + +- **Default (animated)**: tokens stream to stdout character-by-character, then the full message is pretty-printed (via `glow` if installed) +- **Raw (`-r`)**: tokens stream directly, no post-processing formatting +- **Cmd mode (`cmd` command)**: output is treated as a shell command; user is prompted to execute it diff --git a/architecture/REPLAY.md b/architecture/REPLAY.md new file mode 100644 index 0000000..1094b4d --- /dev/null +++ b/architecture/REPLAY.md @@ -0,0 +1,81 @@ +# Replay Command Architecture + +Commands: + +- `clai replay` (aliases: `re`) – replay the most recent message from the global previous query (`prevQuery.json`). +- `clai dre` – replay the most recent message from the *directory-scoped* conversation bound to the current working directory. + +These are *display* commands; they don’t call any LLM vendor. 
+ +## Entry Flow + +### `clai replay` + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → REPLAY + → chat.Replay(postFlagConf.PrintRaw, false) +``` + +### `clai dre` + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → DIRSCOPED_REPLAY + → setupDRE() → returns dreQuerier + → querier.Query(ctx) + → chat.Replay(raw, true) +``` + +`dre` is implemented as a small `models.Querier` wrapper so it fits the common `Setup() → Querier.Query()` execution pattern. + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | Dispatches REPLAY and DIRSCOPED_REPLAY modes | +| `internal/dre.go` | Implements `dreQuerier` and `setupDRE` | +| `internal/chat/replay.go` | Implements `chat.Replay(raw, dirScoped)` | +| `internal/chat/dirscope.go` | Directory binding resolution needed for dir-scoped replay | +| `internal/chat/reply.go` | Stores/loads `prevQuery.json` | +| `internal/utils/pretty_print.go` (or similar) | `AttemptPrettyPrint` (glow formatting, raw mode) | + +## What gets replayed + +### Global replay (`clai replay`) + +`chat.Replay(raw=false, dirScoped=false)`: + +1. Loads `/conversations/prevQuery.json` via `LoadPrevQuery("")`. +2. Selects the last message in the transcript. +3. Pretty prints it via `utils.AttemptPrettyPrint(..., raw)`. + +If `prevQuery.json` is missing, `LoadPrevQuery` prints a warning (`no previous query found`) and returns an empty chat. + +### Directory-scoped replay (`clai dre`) + +`chat.Replay(raw, dirScoped=true)` calls `replayDirScoped`: + +1. Resolves config dir. +2. Loads the directory binding (from `conversations/dirs/` metadata) via `ChatHandler.LoadDirScope("")`. +3. If no binding exists: returns error `no directory-scoped conversation bound to current directory`. +4. Loads the bound conversation JSON `/conversations/.json`. +5. Pretty prints the last message. 
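The shared selection step — load the transcript, take the last message — is simple enough to sketch directly (types are loose mirrors of `internal/chat`, not the real definitions):

```go
package main

import "fmt"

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatLog struct {
	Messages []message `json:"messages"`
}

// lastMessage returns the final transcript entry, which is what both
// replay and dre print; ok is false for an empty transcript.
func lastMessage(c chatLog) (message, bool) {
	if len(c.Messages) == 0 {
		return message{}, false
	}
	return c.Messages[len(c.Messages)-1], true
}

func main() {
	c := chatLog{Messages: []message{
		{Role: "user", Content: "list files"},
		{Role: "assistant", Content: "`ls -la`"},
	}}
	if m, ok := lastMessage(c); ok {
		// raw mode would print Content as-is; otherwise glow formatting is attempted
		fmt.Printf("%s: %s\n", m.Role, m.Content)
	}
}
```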
+ +## Raw vs pretty output + +Both `replay` and `dre` honor `-r/-raw`: + +- raw: print message without glow/format post-processing +- non-raw: attempt markdown formatting via glow + +## Relationship to query reply flags + +- `-re` (reply mode) *uses* `prevQuery.json` as context for the next query. +- `-dre` (dir-reply mode) is implemented by copying the directory-scoped conversation into `prevQuery.json` (see `SaveDirScopedAsPrevQuery`) and then using the normal `-re` plumbing. + +So `replay`/`dre` are for inspection; `-re`/`-dre` are for context selection. diff --git a/architecture/SETUP.md b/architecture/SETUP.md new file mode 100644 index 0000000..b49b492 --- /dev/null +++ b/architecture/SETUP.md @@ -0,0 +1,106 @@ +# Setup Command Architecture + +Command: `clai [flags] setup` (aliases: `s`) + +The **setup** command is an interactive configuration wizard. It helps users create and edit: + +- mode config files (`textConfig.json`, `photoConfig.json`, `videoConfig.json`) +- model-specific vendor config files (e.g. `openai_gpt_gpt-4.1.json`) +- profiles (`/profiles/*.json`) +- MCP server definitions (`/mcpServers/*.json`) + +It is intentionally a “manual editing UI” rather than a declarative config generator. + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → SETUP + → setup.SubCmd() +``` + +`internal.Setup` treats this as a user-initiated command and returns `utils.ErrUserInitiatedExit` after the wizard completes. 
+ +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | Dispatch to `setup.SubCmd()` | +| `internal/setup/setup.go` | Main interactive wizard flow and top menu | +| `internal/setup/setup_actions.go` | The concrete actions (configure, delete, create new, editor-based edits) | +| `internal/setup/mcp_parser.go` | Parses pasted MCP server JSON (Model Context Protocol configuration) | +| `internal/utils/*` | File creation, JSON marshal/unmarshal, user input helpers, editor invocation | +| `internal/text/conf.go` | Provides `text.DefaultProfile` template and config defaults | + +## Wizard stages (interactive UI) + +The UI begins with `stage_0`: + +```text +0. mode-files +1. model files +2. text generation profiles +3. MCP server configuration +``` + +### Stage 0 → Mode-files (selection `0`) + +- Uses `getConfigs(/*Config.json, exclude=[])`. +- Immediately enters `configure(configs, conf)`. + +Intent: let users quickly edit top-level per-mode config such as `textConfig.json`. + +### Stage 0 → Model files (selection `1`) + +- Globs `/*.json` excluding `textConfig`, `photoConfig`, `videoConfig`. +- Prompts for an action: `configure`, `delete`, `configure with editor`. + +These are vendor/model-specific “raw request config” JSON files (see `CONFIG.md`). + +### Stage 0 → Profiles (selection `2`) + +- Operates in `/profiles/*.json`. +- Prompts for an action: configure / delete / create new / configure with editor / prompt edit with editor. +- If “create new” is chosen: + - asks for profile name + - writes `.json` using `text.DefaultProfile` + - then falls through into configuration step. + +### Stage 0 → MCP servers (selection `3`) + +- Operates in `/mcpServers/*.json`. +- Ensures at least one server exists by writing `everything.json` with `defaultMcpServer` if the directory is absent. +- Prompts for an action: configure / delete / create new / configure with editor / paste new config. 
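The "paste new config" action (detailed next) accepts JSON with an `{"mcpServers": {...}}` envelope and writes one file per entry; a minimal sketch of the split step (helper name illustrative, disk writes elided):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pastedConfig matches the envelope the paste flow expects; each server
// value is kept opaque since its schema is vendor-defined.
type pastedConfig struct {
	McpServers map[string]json.RawMessage `json:"mcpServers"`
}

// splitServers sketches the per-entry fan-out performed by
// ParseAndAddMcpServer: one JSON file per server, named after its key.
func splitServers(raw []byte) (map[string][]byte, error) {
	var cfg pastedConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, fmt.Errorf("parse pasted config: %w", err)
	}
	files := make(map[string][]byte)
	for name, server := range cfg.McpServers {
		files[name+".json"] = server
	}
	return files, nil
}

func main() {
	files, err := splitServers([]byte(`{"mcpServers":{"everything":{"command":"npx"}}}`))
	if err != nil {
		panic(err)
	}
	for name := range files {
		fmt.Println(name)
	}
}
```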
+ +Special flow: **paste new config** + +- Reads stdin until `Ctrl+D` or a literal `EOF` line. +- Parses JSON that contains `{"mcpServers": {...}}` via `ParseAndAddMcpServer`. +- Writes one server file per entry (e.g. `.json`). + +## Actions + +Actions are defined as an enum-like type `action`: + +- `conf` – reconfigure JSON by asking questions in-terminal (structured prompts) +- `confWithEditor` – open full JSON in `$EDITOR` +- `promptEditWithEditor` – open *only the prompt field* in `$EDITOR` (profiles) +- `del` – delete file +- `newaction` – create new profile / MCP server file + +The implementation details live in `internal/setup/setup_actions.go`. + +## Error handling and exit codes + +- Any filesystem or parse errors are returned with context and cause a non-zero exit. +- Explicit quit commands (`q/quit/e/exit`) return `utils.ErrUserInitiatedExit`. + +## Developer notes + +- Setup is a user-driven wizard. Adding a new config category generally means: + 1. adding a new top-level menu choice in `stage_0` + 2. adding a new `getConfigs` glob + 3. implementing a `configure(...)` action for the new file type +- MCP “paste” support is the quickest way for users to onboard external tools without manually crafting many files. diff --git a/architecture/STREAMING.md b/architecture/STREAMING.md new file mode 100644 index 0000000..6be40fc --- /dev/null +++ b/architecture/STREAMING.md @@ -0,0 +1,153 @@ +# Chat Completions Streaming Architecture + +This document explains how **streaming** works in clai when calling LLM chat-completions APIs. + +It is an extension of `architecture/QUERY.md`: QUERY describes **when** a streaming request is executed; this document describes **how the streamed response is represented, normalized, and consumed**, independent of vendor. + +## Scope + +- Applies to all commands that rely on `Model.StreamCompletions(...)` (e.g. `query`, `chat`, and any tool-driven follow-up turns). 
+- Covers the **generic vendor streaming layer**, which normalizes vendor-specific streaming payloads into a single stream of Go events. + +## Key Idea: One Generic Event Stream + +All vendors ultimately stream into the same consumer loop: + +- A model implementation produces a stream of events. +- The querier/chat handler reads events and decides what to do: + - print text to stdout as it arrives + - detect tool/function calls + - track usage / stop reasons + - terminate on errors + +In code, the contract is: + +- `Model.StreamCompletions(ctx, chat) (chan completion.Event, error)` (exact types vary by package, but the pattern is consistent) +- The returned channel carries a **normalized** sequence of events. + +### Normalized event types + +Across vendors, clai reduces streaming to a small set of event shapes (as described in `architecture/QUERY.md`): + +- `string` chunks: plain assistant text deltas +- `pub_models.Call`: a tool/function call request (name + JSON args) +- `models.StopEvent`: signals the model has finished this turn +- `models.NoopEvent`: keepalive / ignored +- `error`: any streaming/parsing/network error + +The consumer reads until it sees a terminal condition (stop event, channel close, or an error). 
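The normalized contract above can be sketched as a minimal consumer loop. The event types here (`call`, `stopEvent`, `noopEvent`) are illustrative stand-ins for `pub_models.Call`, `models.StopEvent`, and `models.NoopEvent`; the real loop in `internal/text/querier.go` also prints deltas as they arrive and dispatches tool calls:

```go
package main

import (
	"fmt"
	"strings"
)

// Illustrative stand-ins for the normalized event types; the actual
// definitions live in the models/pub_models packages.
type call struct{ Name, ArgsJSON string }
type stopEvent struct{}
type noopEvent struct{}

// consume drains a normalized event stream the way the querier loop does:
// buffer text deltas, collect tool calls, and end the turn on a stop
// event, a channel close, or an error.
func consume(events <-chan any) (string, []call, error) {
	var full strings.Builder
	var calls []call
	for ev := range events {
		switch e := ev.(type) {
		case string:
			full.WriteString(e) // text delta: print + buffer
		case call:
			calls = append(calls, e) // tool/function call request
		case stopEvent:
			return full.String(), calls, nil // model finished this turn
		case noopEvent:
			// keepalive: ignored
		case error:
			return full.String(), calls, e // abort with context
		}
	}
	return full.String(), calls, nil // channel close also ends the turn
}

func main() {
	ch := make(chan any, 4)
	ch <- "Hello, "
	ch <- "world"
	ch <- stopEvent{}
	close(ch)
	msg, _, err := consume(ch)
	fmt.Println(msg, err)
}
```

Because every vendor is reduced to this one event shape, the loop above never needs to know whether the bytes arrived as SSE frames, JSON lines, or chunked JSON.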
+ +## Files to Read + +### Generic streaming (vendor-agnostic) + +| File | Purpose | +|------|---------| +| `internal/text/querier.go` | The stream consumer loop: prints deltas, dispatches tool calls, handles stop conditions | +| `internal/text/generic/stream_completer.go` | Generic stream completer: takes vendor events and emits normalized events | +| `internal/text/generic/stream_completer_models.go` | Small model-related helpers/types used by the generic stream completer | +| `internal/text/generic/stream_completer_setup.go` | Wiring/config for building the stream completer | + +### Vendor implementations (examples) + +| Vendor | File(s) | Notes | +|--------|---------|------| +| Anthropic | `internal/vendors/anthropic/claude_stream.go`, `claude_stream_block_events.go` | Parses SSE/event-stream frames and turns them into blocks/deltas | +| OpenAI | `internal/vendors/openai/gpt.go` | Uses OpenAI-compatible streaming and maps deltas/tool calls into generic events | +| Others | `internal/vendors/*/*.go` | Each vendor maps its wire format into the same normalized events | + +## Streaming Data Flow (End-to-End) + +At a high level: + +``` +CLI (query/chat) + → build InitialChat (messages + config) + → Model.StreamCompletions(ctx, chat) + → vendor HTTP request (stream=true) + → vendor streaming parser (SSE/JSON lines/etc) + → generic event normalization + → chan back to caller + → querier/chat event loop consumes events + → prints text as it arrives + → on tool call: run tool + append messages + continue + → on stop: finalize output + persist prevQuery/chat +``` + +The important architectural point is that **the querier does not care about the vendor wire format**. It receives a single stream of normalized events. + +## The Generic Vendor System (Works for Both) + +clai supports: + +1. **Vendor-specific model implementations** (OpenAI, Anthropic, Gemini, etc.) +2. 
A **generic streaming adapter** used to unify behavior across vendors + +This generic layer is specifically designed so that streaming works the same way regardless of which underlying API is used: + +- Text deltas become `string` events +- Tool/function calls become `pub_models.Call` events +- Vendor stop/finish signals become `models.StopEvent` +- Anything else becomes `models.NoopEvent` or an `error` + +This means the rest of the app (query/chat/tool recursion) can be implemented once. + +## Streaming Loop Responsibilities + +The consumer loop (see `internal/text/querier.go`, and chat equivalents) is responsible for: + +1. **Aggregating assistant text** + - Each `string` delta is appended to a buffer (e.g. `fullMsg`) + - Deltas are printed immediately for the interactive streaming experience + +2. **Tool call detection and execution** + - When a `pub_models.Call` is seen, the current assistant output is finalized + - The tool call is appended to the chat + - The tool is invoked via the registry (`internal/tools`) + - Tool output is appended to the chat + - The model is called again (recursive continuation) so it can incorporate the tool result + +3. **Termination** + - `models.StopEvent` ends the turn + - Channel close ends the turn (depending on vendor) + - Any `error` aborts the query with context + +4. **Post-processing** + - Append the assistant message to the chat + - Save `prevQuery.json` for reply mode + - In non-raw mode, pretty-print the final output (glow, etc.) + +## Vendor Streaming Differences (and How They Get Normalized) + +Vendors differ in at least four common ways: + +1. **Transport**: SSE (`text/event-stream`) vs JSON lines vs chunked JSON +2. **Delta shape**: content tokens, content blocks, role markers, partial JSON tool args +3. **Tool calls**: + - some vendors stream function name first then args + - others stream structured tool-call deltas +4. 
**Stop conditions**: + - explicit finish_reason/stop_reason + - an event like `[DONE]` + - clean EOF + +The streaming adapters convert these vendor-specific variations into the normalized event set so the querier can remain vendor-agnostic. + +## Error Handling + +Streaming can fail mid-response (network, vendor errors, invalid event frames). The architecture uses these rules: + +- Vendor parser errors are surfaced as `error` events or as an error returned from `StreamCompletions`. +- The consumer loop stops immediately on error. +- Higher-level logic (as described in `QUERY.md`) may retry on rate limit errors. + +## How This Relates to QUERY.md + +- `QUERY.md` explains how a user prompt becomes a streaming model call and how the app manages configuration, tool calls, and persistence. +- This document explains the **streaming contract** that makes that possible across vendors. + +If you are modifying streaming behavior, start from: + +- `internal/text/querier.go` (consumer semantics) +- `internal/text/generic/stream_completer.go` (normalization rules) +- the vendor stream implementations (wire parsing) diff --git a/architecture/TOOLING.md b/architecture/TOOLING.md new file mode 100644 index 0000000..763f9e8 --- /dev/null +++ b/architecture/TOOLING.md @@ -0,0 +1,212 @@ +# Tooling System Architecture + +This document describes **how clai’s tooling system works end-to-end**, including: + +- how tools are registered and discovered +- how tools are *selected/allowed* for a given run (`-t/-tools`) +- how tool calls flow through the runtime (LLM ↔ tool executor) +- how **MCP servers** are configured and exposed as tools + +> Related docs: +> +>- `architecture/TOOLS.md` describes the **`clai tools` inspection command**. +>- `architecture/QUERY.md` describes query/chat runtime behavior. +>- `architecture/CONFIG.md` documents config layout and flags. 
+ +## Terminology + +- **Tool**: A callable capability exposed to the model with a JSON schema (name, description, parameters) and an implementation. +- **Registry**: The in-process catalog of all known tools (built-ins and MCP-derived). +- **Allowed tools**: The subset of registered tools the model is permitted to call for a given run. +- **Built-in tool**: Implemented inside this repo (e.g. filesystem, `rg`, `go test`). +- **MCP tool**: A tool whose implementation is provided by an external **Model Context Protocol (MCP)** server. + +## High-level flow + +At a high level, tool usage is: + +1. **Startup** initializes the tool registry. +2. The user’s flags/config determine which tools are **allowed** for that run. +3. The runtime sends the tool specifications of the allowed tools to the LLM. +4. The LLM may respond with a **tool call** (name + JSON arguments). +5. clai executes the tool (built-in handler or MCP client call). +6. The tool result is returned to the LLM as a tool result message. +7. The loop continues until a final answer is produced. + +## Registry: discovery and registration + +All tools that can possibly be used by clai must be present in the **tool registry**. + +### Built-in tools + +Built-ins are registered during tooling initialization. Conceptually: + +- tooling init constructs a registry +- each built-in tool is registered with: + - a **stable tool name** + - a **JSON schema** for parameters + - an **executor** (Go code) that runs the tool and returns a structured result + +Built-in tools typically run locally (e.g., execute a Go command, search files, read file contents) and must: + +- validate arguments +- produce deterministic/structured output +- return errors with context (`fmt.Errorf(": %w", err)`) so failures are explainable + +### MCP tools + +MCP tools are discovered from configured MCP servers (see [MCP servers](#mcp-servers)). During tooling initialization: + +1. clai reads the MCP server configurations. +2. 
For each configured server, clai connects (or prepares a client) and fetches tool metadata. +3. clai registers those tools into the same registry as built-ins. + +To avoid name collisions and to make origin explicit, MCP tools are typically namespaced/prefixed (for example with `mcp_...`). + +## Allowed tools: selection and enforcement + +Tool *existence* (registered) is separate from tool *permission* (allowed). + +### Sources of allowed-tool configuration + +Which tools are allowed is driven by: + +- CLI flags (`-t/-tools`) +- configuration defaults (profile/config files) + +The selection process: + +- resolves wildcards/globs +- validates the requested tools exist (or are acceptable MCP tool references) +- produces the final allow-list (or disables tooling if empty) + +### Semantics + +Common patterns: + +- `-t=*` means **all tools** are allowed. +- `-t=a,b,c` means only those tools are allowed. +- If the final allow-list is empty, tool calling is disabled for that run. + +### Enforcement points + +Enforcement happens in two key places: + +1. **Before sending tool specs to the model**: only allowed tools are advertised. +2. **Before executing a tool call**: the executor checks the tool name is allowed. If not, it fails with an error explaining the tool is not permitted. + +This prevents accidental execution even if a model “hallucinates” a tool name. + +## Tool call execution model + +A model tool call is represented as: + +- `tool_name`: string +- `arguments`: JSON object + +Execution steps: + +1. Look up `tool_name` in the registry. +2. Validate the tool is allowed. +3. Validate/parse arguments according to the tool’s schema. +4. Execute: + - built-in executor (local) + - MCP executor (RPC to server) +5. Capture stdout/stderr (where applicable), structure the result, and return it to the model. 
+ +Tool execution should be: + +- bounded (context-aware cancellation) +- safe (respect configured project roots / allowed paths where applicable) +- explicit about failures (errors with context) + +## MCP servers + +MCP (Model Context Protocol) servers let clai use tools implemented outside this repository. + +### What clai uses MCP for + +clai treats each MCP server as a provider of: + +- a set of tool specifications (name/description/JSON schema) +- a protocol endpoint to execute tool calls + +Those tools are imported into the registry and become selectable via `-t/-tools` like any other tool. + +### Configuration layout + +MCP servers are configured under the clai config directory, conceptually: + +- `/mcpServers/*.json` + +Each JSON file describes one MCP server. The exact schema is defined by the project’s config code, but typically includes: + +- a display/name/ID +- how to start/connect to the server (e.g. command + args, or URL) +- environment variables +- optional allow/deny lists of tools + +### Lifecycle + +MCP server lifecycle is: + +1. **Load configuration** from `mcpServers/*.json`. +2. **Start/connect** to the MCP server. +3. **Discover tools** exposed by that server. +4. **Register tools** with namespacing to avoid collisions. +5. When the model calls an MCP tool, clai: + - serializes arguments + - performs an MCP request + - returns the MCP response as the tool result +6. On shutdown/cancel, clai closes client connections and terminates spawned processes. + +### Naming and selection + +Because MCP servers are external and tool names can overlap with built-ins, MCP-derived tools should be distinguishable. 
+ +Practically: + +- MCP tools are accepted/validated by name (often with an `mcp_` prefix) +- `-t=*` includes MCP tools in addition to built-ins +- `clai tools` will list MCP tools if they are configured and initialized + +### Error handling + +MCP calls can fail due to: + +- server startup/connect errors +- tool not found on the server +- invalid arguments +- server-side execution errors +- timeouts/cancellation + +All such failures should be surfaced as contextual errors (e.g. `fmt.Errorf("call mcp tool %q on server %q: %w", tool, server, err)`). + +## Inspection vs execution + +Two related but distinct concepts: + +- **Inspection** (`clai tools ...`) lists tools and shows their JSON specs. It does not run a query. +- **Execution** (`clai query` / `clai chat`) uses the allowed-tool list to decide what the model can call. + +The inspection command is useful for: + +- verifying your MCP servers are configured correctly +- seeing the exact JSON schema the model sees +- checking tool naming + +## Security and safety considerations + +Tooling can execute code or access local files. The design relies on: + +- explicit opt-in via `-t/-tools` (or config defaults) +- path scoping / allowed-root enforcement for filesystem tools +- context cancellation + timeouts +- clear logging/error messages + +If you add a new tool: + +- keep the schema minimal and strict +- ensure arguments are validated +- avoid implicit ambient access (require explicit paths / commands) +- make failures actionable with contextual errors diff --git a/architecture/TOOLS.md b/architecture/TOOLS.md new file mode 100644 index 0000000..0b897ce --- /dev/null +++ b/architecture/TOOLS.md @@ -0,0 +1,81 @@ +# Tools Command Architecture + +Command: `clai [flags] tools [tool name]` (aliases: `t`) + +The **tools** command is an *inspection/UI* command. 
It does **not** enable tools for a query; it lists what tools are available to the runtime (built-ins registered in the local registry) and can print the JSON schema/spec for one tool. + +> Related flag: `-t/-tools` (string) on `query`/`chat` controls *which tools the LLM may call* during that run. See `QUERY.md` and `CONFIG.md`. + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → TOOLS + → tools.Init() + → tools.SubCmd(ctx, allArgs) +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | Dispatches TOOLS mode and calls `tools.Init()` then `tools.SubCmd()` | +| `internal/tools/init.go` (and friends) | Initializes the tool registry (built-in tools + MCP tools, if configured) | +| `internal/tools/cmd.go` | Implements `clai tools` CLI behavior | +| `internal/tools/registry.go` | Tool registry: `Get`, `All`, wildcard selection | +| `pkg/text/models/tool.go` (or similar) | Public tool spec types serialized to JSON | + +## Behavior + +### `clai tools` + +`internal/tools/cmd.go:SubCmd`: + +1. Loads all registered tools via `Registry.All()`. +2. Sorts tool names. +3. Prints a human readable list: + + - one entry per tool + - attempts to fit descriptions to terminal width via `utils.WidthAppropriateStringTrunc`. + +4. Prints an instruction footer: + + ```text + Run 'clai tools ' for more details. + ``` + +Returns `utils.ErrUserInitiatedExit` so the top-level `main.run()` exits with code 0. + +### `clai tools ` + +If a second CLI arg exists (`args[1]`), it is interpreted as the tool name: + +1. Looks up the tool in the registry: `Registry.Get(toolName)`. +2. If missing: returns an error (`tool '' not found`). +3. If present: marshals the tool `Specification()` as pretty JSON and prints it. + +Also returns `utils.ErrUserInitiatedExit`. + +## Registry and Init + +`tools.Init()` must be called before listing tools. 
+ +Conceptually, Init is responsible for: + +- registering built-in tools (filesystem, `go test`, `rg`, etc.) +- reading MCP server configs under `/mcpServers/*.json` and adding `mcp_...` tools (via an MCP client integration) + +The CLI *selection* logic for `-t/-tools` lives in `internal/setup.go:setupToolConfig()`: + +- `-t=*` ⇒ clear `RequestedToolGlobs` ⇒ interpreted as “allow all tools”. +- `-t=a,b,c` ⇒ validate each name: + - built-ins must exist in the registry (wildcards supported) + - MCP tools are accepted if prefixed with `mcp_` +- if no valid tools are selected, tooling is disabled for that run. + +## Error handling and exit codes + +- Listing tools is considered a user-driven info command: it returns `utils.ErrUserInitiatedExit`. +- Unknown tool name is a real error from `tools.SubCmd` and propagates to `main` => non-zero exit. diff --git a/architecture/VERSION.md b/architecture/VERSION.md new file mode 100644 index 0000000..1336527 --- /dev/null +++ b/architecture/VERSION.md @@ -0,0 +1,45 @@ +# Version Command Architecture + +Command: `clai version` + +The **version** command prints build/version information and exits. It is implemented as a special-case in `internal.Setup()`. + +## Entry Flow + +```text +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() + → getCmdFromArgs() → VERSION + → printVersion() + → exit (utils.ErrUserInitiatedExit) +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | Dispatches VERSION mode | +| `internal/version.go` | Implements `printVersion()` | + +## Output + +`internal/version.go:printVersion()` prints: + +1. If linker-injected build variables are present: + - `version: ` +2. Otherwise it reads module build info via `runtime/debug.ReadBuildInfo()` and prints: + - `version: ` +3. 
It then prints each module dependency: + +```text + +``` + +## Build variables + +`BuildVersion` and `BuildChecksum` are package-level variables intended to be set via build flags in a pipeline. If not set, `go install` builds will rely on `debug.ReadBuildInfo()`. + +## Exit behavior + +Returns `utils.ErrUserInitiatedExit` so the top-level runner exits with code 0. diff --git a/architecture/VIDEO.md b/architecture/VIDEO.md new file mode 100644 index 0000000..55d3788 --- /dev/null +++ b/architecture/VIDEO.md @@ -0,0 +1,106 @@ +# Video Command Architecture + +Command: `clai [flags] video ` (aliases: `v`) + +The **video** command generates videos using AI models (currently OpenAI Sora) from a text prompt, optionally with an input image. + +## Entry Flow + +``` +main.go:run() + → internal.Setup(ctx, usage, args) + → parseFlags() # extract CLI flags + → getCmdFromArgs() # returns VIDEO mode + → LoadConfigFromFile("videoConfig.json") + → applyFlagOverridesForVideo() + → vConf.SetupPrompts() # build prompt from args/stdin/reply + → CreateVideoQuerier(vConf) # vendor-specific querier + → querier.Query(ctx) # execute the video generation +``` + +## Key Files + +| File | Purpose | +|------|---------| +| `internal/setup.go` | `Setup()` VIDEO case — loads config, creates querier | +| `internal/video/conf.go` | `Configurations` struct, `Default`, `OutputType` enum | +| `internal/video/prompt.go` | `SetupPrompts()` — prompt assembly with reply/stdin/image support | +| `internal/video/store.go` | `SaveVideo()` — decodes base64 and writes to disk | +| `internal/create_queriers.go` | `CreateVideoQuerier()` — routes to OpenAI Sora | +| `internal/vendors/openai/sora.go` | OpenAI Sora video querier implementation | + +## Configuration + +### `videoConfig.json` + +```json +{ + "model": "sora-2", + "prompt-format": "%v", + "output": { + "type": "unset", + "dir": "$HOME/Videos", + "prefix": "clai" + } +} +``` + +### Key Fields + +| Field | Description | +|-------|-------------| +| 
`model` | Model name (currently only `sora-*` supported) | +| `prompt-format` | Go format string; `%v` is replaced with the user prompt | +| `output.type` | `"local"` (save to disk), `"url"` (print URL), or `"unset"` | +| `output.dir` | Directory for saved videos (default: `$HOME/Videos`) | +| `output.prefix` | Filename prefix (default: `clai`) | + +### Flag Overrides + +| Flag | Config Field | +|------|-------------| +| `-vm` / `-video-model` | `model` | +| `-vd` / `-video-dir` | `output.dir` | +| `-vp` / `-video-prefix` | `output.prefix` | +| `-re` / `-reply` | Enables reply mode | +| `-I` / `-replace` | Stdin replacement token | + +## Prompt Assembly + +`Configurations.SetupPrompts()` in `internal/video/prompt.go`: + +1. If **reply mode** (`-re`): loads `prevQuery.json`, serializes messages as JSON context, prepends to prompt +2. Calls `utils.Prompt(stdinReplace, args)` to build user prompt from CLI args + stdin +3. Runs `chat.PromptToImageMessage(prompt)` to detect base64-encoded images in the prompt + - If an image is found: sets `PromptImageB64` for image-to-video generation + - Text portion becomes the prompt +4. If no image: applies `PromptFormat` to the prompt text + +## Vendor Routing + +`CreateVideoQuerier()` in `internal/create_queriers.go`: + +| Model Pattern | Vendor | +|---------------|--------| +| contains `sora` | OpenAI (`openai.NewVideoQuerier`) | + +Only Sora models are currently supported. The output directory is auto-created if it doesn't exist. + +## Output + +### Local Storage + +`SaveVideo()` in `internal/video/store.go`: + +1. Decodes base64 response from the API +2. Generates filename: `_.` +3. Writes to `output.dir`; falls back to `/tmp` on failure + +### URL Mode + +When `output.type` is `"url"`, the querier prints the video URL directly. 
+ +## Validation + +- `ValidateOutputType()` ensures `output.type` is one of `local`, `url`, `unset` +- If `output.type` is `local`, the directory is created via `os.MkdirAll` if missing diff --git a/internal/dre.go b/internal/dre.go index e461185..385225e 100644 --- a/internal/dre.go +++ b/internal/dre.go @@ -27,7 +27,7 @@ func (q dreQuerier) Query(ctx context.Context) error { var _ models.Querier = (*dreQuerier)(nil) func setupDRE(mode Mode, postFlagConf Configurations, _ []string) (models.Querier, error) { - if mode != DRE { + if mode != DIRSCOPED_REPLAY { return nil, errors.New("setupDRE: unexpected mode") } return &dreQuerier{raw: postFlagConf.PrintRaw}, nil diff --git a/internal/setup.go b/internal/setup.go index 893d2ec..828fd94 100644 --- a/internal/setup.go +++ b/internal/setup.go @@ -41,7 +41,7 @@ const ( SETUP CMD REPLAY - DRE + DIRSCOPED_REPLAY TOOLS PROFILES ) @@ -105,7 +105,7 @@ func getCmdFromArgs(args []string) (Mode, error) { case "replay", "re": return REPLAY, nil case "dre": - return DRE, nil + return DIRSCOPED_REPLAY, nil case "tools", "t": return TOOLS, nil case "profiles": @@ -386,7 +386,7 @@ func Setup(ctx context.Context, usage string, allArgs []string) (models.Querier, return nil, fmt.Errorf("failed to replay previous reply: %w", err) } return nil, utils.ErrUserInitiatedExit - case DRE: + case DIRSCOPED_REPLAY: return setupDRE(mode, postFlagConf, postFlagArgs) case TOOLS: tools.Init() diff --git a/internal/text/querier.go b/internal/text/querier.go index f86dbbd..95f62b5 100644 --- a/internal/text/querier.go +++ b/internal/text/querier.go @@ -153,16 +153,20 @@ func (q *Querier[C]) postProcess() { if q.hasPrinted { return } - // Nothing to post process if message for some reason is empty (happens during tools calls sometimes) - if q.fullMsg == "" { - return - } q.hasPrinted = true - newSysMsg := pub_models.Message{ - Role: "system", - Content: q.fullMsg, + + // Append the assistant response if we received any content + if q.fullMsg != "" { + 
newSysMsg := pub_models.Message{ + Role: "system", + Content: q.fullMsg, + } + q.chat.Messages = append(q.chat.Messages, newSysMsg) } - q.chat.Messages = append(q.chat.Messages, newSysMsg) + + // Always save the conversation when configured to do so, even on errors + // or when no tokens were received. This preserves the user's messages + // so the conversation context is not lost. if q.shouldSaveReply { err := chat.SaveAsPreviousQuery(q.configDir, q.chat) if err != nil { @@ -174,6 +178,11 @@ func (q *Querier[C]) postProcess() { ancli.PrintOK(fmt.Sprintf("Querier.postProcess:\n%v\n", debug.IndentedJsonFmt(q))) } + // Nothing to render if message is empty (happens during tool calls sometimes) + if q.fullMsg == "" { + return + } + // Cmd mode is a bit of a hack, it will handle all output if q.cmdMode { err := q.handleCmdMode() @@ -183,7 +192,10 @@ func (q *Querier[C]) postProcess() { return } - q.postProcessOutput(newSysMsg) + q.postProcessOutput(pub_models.Message{ + Role: "system", + Content: q.fullMsg, + }) } func (q *Querier[C]) postProcessOutput(newSysMsg pub_models.Message) { @@ -282,6 +294,10 @@ func (q *Querier[C]) Query(ctx context.Context) error { if q.out == nil { q.out = os.Stdout } + // Ensure we always persist the conversation in reply mode, even when we fail + // before we've started streaming completions. 
+ defer q.postProcess() + if q.rateLimitRetries > RateLimitRetries { return fmt.Errorf("rate limit retry limit exceeded (%v), giving up", RateLimitRetries) } @@ -298,7 +314,6 @@ func (q *Querier[C]) Query(ctx context.Context) error { return fmt.Errorf("failed to stream completions: %w", err) } - defer q.postProcess() defer func() { tokenCounter, isModelCounter := any(q.Model).(models.UsageTokenCounter) if !isModelCounter { diff --git a/internal/text/querier_test.go b/internal/text/querier_test.go index 88ff067..6d855a3 100644 --- a/internal/text/querier_test.go +++ b/internal/text/querier_test.go @@ -364,6 +364,169 @@ func Test_Querier(t *testing.T) { }) } +func Test_Querier_SavesConversationOnError(t *testing.T) { + t.Run("it should save conversation even when query returns an error with no tokens", func(t *testing.T) { + tmpConfigDir := path.Join(t.TempDir(), ".clai") + os.MkdirAll(path.Join(tmpConfigDir, "conversations"), os.ModePerm) + q := Querier[*MockQuerier]{ + Raw: true, + out: &strings.Builder{}, + shouldSaveReply: true, + configDir: tmpConfigDir, + chat: pub_models.Chat{ + ID: "prevQuery", + Messages: []pub_models.Message{ + { + Role: "system", + Content: "you are a helpful assistant", + }, + { + Role: "user", + Content: "hello world", + }, + }, + }, + Model: &MockQuerier{ + shouldBlock: false, + completionChan: make(chan models.CompletionEvent), + errChan: make(chan error), + }, + } + + // Send an error immediately without any tokens + go func() { + q.Model.errChan <- errors.New("API connection failed") + q.Model.completionChan <- "CLOSE" + }() + + err := q.Query(context.Background()) + if err == nil { + t.Fatal("expected error from Query") + } + + // The conversation should still be saved despite the error + lastReply, err := chat.LoadPrevQuery(q.configDir) + if err != nil { + t.Fatalf("failed to load prev query: %v", err) + } + // Should have the original messages (system + user) preserved + if len(lastReply.Messages) < 2 { + t.Fatalf("expected at 
least 2 messages to be saved, got: %v, data: %v", len(lastReply.Messages), lastReply.Messages) + } + if lastReply.Messages[1].Content != "hello world" { + t.Fatalf("expected user message to be preserved, got: %v", lastReply.Messages[1].Content) + } + }) + + t.Run("it should save conversation with partial content on error", func(t *testing.T) { + tmpConfigDir := path.Join(t.TempDir(), ".clai") + os.MkdirAll(path.Join(tmpConfigDir, "conversations"), os.ModePerm) + q := Querier[*MockQuerier]{ + Raw: true, + out: &strings.Builder{}, + shouldSaveReply: true, + configDir: tmpConfigDir, + chat: pub_models.Chat{ + ID: "prevQuery", + Messages: []pub_models.Message{ + { + Role: "system", + Content: "you are a helpful assistant", + }, + { + Role: "user", + Content: "hello world", + }, + }, + }, + Model: &MockQuerier{ + shouldBlock: false, + completionChan: make(chan models.CompletionEvent), + errChan: make(chan error), + }, + } + + // Send some tokens, then an error + go func() { + q.Model.completionChan <- "partial response" + q.Model.errChan <- errors.New("connection dropped") + q.Model.completionChan <- "CLOSE" + }() + + err := q.Query(context.Background()) + if err == nil { + t.Fatal("expected error from Query") + } + + // The conversation should be saved with the partial content + lastReply, err := chat.LoadPrevQuery(q.configDir) + if err != nil { + t.Fatalf("failed to load prev query: %v", err) + } + // Should have original messages + the partial assistant response + if len(lastReply.Messages) < 3 { + t.Fatalf("expected at least 3 messages (system + user + partial), got: %v, data: %v", len(lastReply.Messages), lastReply.Messages) + } + if lastReply.Messages[2].Content != "partial response" { + t.Fatalf("expected partial response to be saved, got: %v", lastReply.Messages[2].Content) + } + }) +} + +func Test_Querier_SavesConversation_WhenStreamSetupFailsDueToRateLimitTokenCount(t *testing.T) { + tmpConfigDir := path.Join(t.TempDir(), ".clai") + if err := 
os.MkdirAll(path.Join(tmpConfigDir, "conversations"), os.ModePerm); err != nil { + t.Fatalf("mkdir conversations: %v", err) + } + + q := Querier[*MockQuerierRateLimitTokenCountFail]{ + Raw: true, + out: &strings.Builder{}, + shouldSaveReply: true, + configDir: tmpConfigDir, + chat: pub_models.Chat{ + ID: "prevQuery", + Messages: []pub_models.Message{ + {Role: "system", Content: "you are a helpful assistant"}, + {Role: "user", Content: "please do the thing"}, + }, + }, + Model: &MockQuerierRateLimitTokenCountFail{}, + } + + err := q.Query(context.Background()) + if err == nil { + t.Fatal("expected error") + } + if !strings.Contains(err.Error(), "failed to count tokens") { + t.Fatalf("expected token count error, got: %v", err) + } + + // Even though stream setup failed, we should persist prevQuery in reply mode. + lastReply, err := chat.LoadPrevQuery(q.configDir) + if err != nil { + t.Fatalf("load prev query: %v", err) + } + if len(lastReply.Messages) < 2 { + t.Fatalf("expected at least 2 messages saved, got: %v, data: %v", len(lastReply.Messages), lastReply.Messages) + } + if lastReply.Messages[1].Content != "please do the thing" { + t.Fatalf("expected user message to be preserved, got: %q", lastReply.Messages[1].Content) + } +} + +type MockQuerierRateLimitTokenCountFail struct{} + +func (m *MockQuerierRateLimitTokenCountFail) Setup() error { return nil } + +func (m *MockQuerierRateLimitTokenCountFail) StreamCompletions(context.Context, pub_models.Chat) (chan models.CompletionEvent, error) { + return nil, models.NewRateLimitError(time.Now().Add(time.Millisecond), 1000, 0) +} + +func (m *MockQuerierRateLimitTokenCountFail) CountInputTokens(context.Context, pub_models.Chat) (int, error) { + return 0, errors.New("token count request failed") +} + func Test_ChatQuerier(t *testing.T) { q := &Querier[*MockQuerier]{ Model: &MockQuerier{}, diff --git a/main_cmd_goldenfile_test.go b/main_cmd_goldenfile_test.go new file mode 100644 index 0000000..c0df765 --- /dev/null +++ 
b/main_cmd_goldenfile_test.go @@ -0,0 +1,54 @@ +package main + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_CMD_quit_does_not_execute(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + // handleCmdMode reads from a TTY path; provide a temp file containing our choice. + tty := filepath.Join(t.TempDir(), "tty") + if err := os.WriteFile(tty, []byte("q\n"), 0o644); err != nil { + t.Fatalf("WriteFile(tty): %v", err) + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + t.Setenv("TTY", tty) + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + // test model echoes input; cmd mode will then ask for execute/quit. + gotStatus = run(strings.Split("-r -cm test cmd echo hi", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatus, 0) + + // Expected behavior: + // - model output (echoed input) + // - newline injected by cmd-mode + // - prompt + want := "echo hi\n\nDo you want to [e]xecute cmd, [q]uit?: " + testboil.FailTestIfDiff(t, stdout, want) +} diff --git a/main_config_goldenfile_test.go b/main_config_goldenfile_test.go new file mode 100644 index 0000000..c5ef502 --- /dev/null +++ b/main_config_goldenfile_test.go @@ -0,0 +1,56 @@ +package main + +import ( + "encoding/json" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/clai/internal/text" + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_CONFIG_flag_defaults_do_not_override_mode_config(t *testing.T) { + // Behaviour: config precedence is flags > file > defaults. 
+ // This test ensures that *default flag values* do not override values loaded + // from textConfig.json. + oldArgs := os.Args + t.Cleanup(func() { os.Args = oldArgs }) + + confDir := t.TempDir() + t.Setenv("CLAI_CONFIG_DIR", confDir) + + // Create required config subdirs (matches existing goldenfile tests). + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + // Write a textConfig.json that sets the model to "test". + cfg := text.Default + cfg.Model = "test" + b, err := json.Marshal(cfg) + if err != nil { + t.Fatalf("Marshal(text config): %v", err) + } + if err := os.WriteFile(filepath.Join(confDir, "textConfig.json"), b, 0o644); err != nil { + t.Fatalf("WriteFile(textConfig.json): %v", err) + } + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + // Intentionally do not pass -cm; the config file should decide the model. 
+ gotStatus = run(strings.Split("-r q hello", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatus, 0) + testboil.FailTestIfDiff(t, stdout, "hello\n") +} diff --git a/main_help_goldenfile_test.go b/main_help_goldenfile_test.go new file mode 100644 index 0000000..94aa739 --- /dev/null +++ b/main_help_goldenfile_test.go @@ -0,0 +1,77 @@ +package main + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/clai/internal" + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_HELP_prints_usage(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + + var gotStatusCode int + gotStdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatusCode = run(strings.Split("help", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatusCode, 0) + if gotStdout == "" { + t.Fatal("expected help output to be non-empty") + } + // The usage string is large; check for one stable snippet and that config dir was interpolated. 
+ testboil.AssertStringContains(t, gotStdout, "Usage:") + testboil.AssertStringContains(t, gotStdout, confDir) +} + +func Test_goldenFile_HELP_profile_prints_profile_help(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + + var gotStatusCode int + gotStdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatusCode = run(strings.Split("help profile", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatusCode, 0) + want := internal.ProfileHelp + "\n" + testboil.FailTestIfDiff(t, gotStdout, want) +} diff --git a/main_profiles_goldenfile_test.go b/main_profiles_goldenfile_test.go new file mode 100644 index 0000000..c0d75f6 --- /dev/null +++ b/main_profiles_goldenfile_test.go @@ -0,0 +1,126 @@ +package main + +import ( + "encoding/json" + "fmt" + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_PROFILES_list_prints_summary_for_valid_profiles_and_skips_invalid(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + t.Setenv("CLAI_CONFIG_DIR", confDir) + + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + profilesDir := filepath.Join(confDir, "profiles") + + // Valid profile with explicit name. + valid1 := map[string]any{ + "name": "cody", + "model": "test", + "tools": []string{"bash", "rg"}, + "prompt": "You are Cody. Be helpful. 
Second sentence.", + } + b, err := json.Marshal(valid1) + if err != nil { + t.Fatalf("Marshal(valid1): %v", err) + } + if err := os.WriteFile(filepath.Join(profilesDir, "cody.json"), b, 0o644); err != nil { + t.Fatalf("WriteFile(cody.json): %v", err) + } + + // Valid profile without name; should fall back to filename. + valid2 := map[string]any{ + "model": "test", + "tools": []string{}, + "prompt": "First line only\nsecond line", + } + b, err = json.Marshal(valid2) + if err != nil { + t.Fatalf("Marshal(valid2): %v", err) + } + if err := os.WriteFile(filepath.Join(profilesDir, "gopher.json"), b, 0o644); err != nil { + t.Fatalf("WriteFile(gopher.json): %v", err) + } + + // Invalid JSON; must be skipped. + if err := os.WriteFile(filepath.Join(profilesDir, "broken.json"), []byte("{not-json"), 0o644); err != nil { + t.Fatalf("WriteFile(broken.json): %v", err) + } + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatus = run(strings.Split("profiles", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatus, 0) + + // runProfilesList iterates directory entries; order is not guaranteed. + // Assert presence of key blocks rather than exact full output. + testboil.AssertStringContains(t, stdout, "Name: cody\n") + testboil.AssertStringContains(t, stdout, "Model: test\n") + testboil.AssertStringContains(t, stdout, fmt.Sprintf("Tools: %v\n", []string{"bash", "rg"})) + testboil.AssertStringContains(t, stdout, "First sentence prompt: You are Cody.\n---\n") + + testboil.AssertStringContains(t, stdout, "Name: gopher\n") + testboil.AssertStringContains(t, stdout, fmt.Sprintf("Tools: %v\n", []string{})) + // Note: getFirstSentence includes the newline terminator when splitting on \n. 
+ testboil.AssertStringContains(t, stdout, "First sentence prompt: First line only\n\n---\n") + + if strings.Contains(stdout, "broken") { + t.Fatalf("output must not include invalid profile file name; got output: %q", stdout) + } +} + +func Test_goldenFile_PROFILES_list_warns_when_no_profiles_found(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + t.Setenv("CLAI_CONFIG_DIR", confDir) + + required := []string{ + "conversations", + "profiles", // created, but empty + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatus = run(strings.Split("profiles", " ")) + }) + + // profiles list exits via ErrUserInitiatedExit which main.run maps to status code 0. + testboil.FailTestIfDiff(t, gotStatus, 0) + testboil.AssertStringContains(t, stdout, "warning") + testboil.AssertStringContains(t, stdout, "no profiles found in ") + testboil.AssertStringContains(t, stdout, filepath.Join(confDir, "profiles")) +} diff --git a/main_query_goldenfile_test.go b/main_query_goldenfile_test.go new file mode 100644 index 0000000..0065eb0 --- /dev/null +++ b/main_query_goldenfile_test.go @@ -0,0 +1,100 @@ +package main + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_QUERY_stdin_and_token_replacement(t *testing.T) { + // Goldenfile-ish CLI contract test for the query command. 
+ // + // Covers QUERY.md behaviour: + // - stdin prompts when pipe detected and no args + // - stdin replaces {} token in args when pipe detected and args present + // - custom replacement token via -I + + tcs := []struct { + name string + stdin string + args string + wantOut string + wantCode int + }{ + { + name: "stdin_only_becomes_prompt", + stdin: "from-stdin", + args: "-r -cm test q", + wantOut: "from-stdin\n", + wantCode: 0, + }, + { + name: "stdin_replaces_default_token", + stdin: "X", + args: "-r -cm test q hello {} world", + // Note: current Prompt() semantics append stdin after args as well. + wantOut: "hello X world X\n", + wantCode: 0, + }, + { + name: "stdin_replaces_custom_token", + stdin: "Y", + args: "-r -cm test -I __ q hello __ world", + // Note: replacement does not currently occur for the custom token; stdin is appended instead. + wantOut: "hello __ world Y\n", + wantCode: 0, + }, + } + + for _, tc := range tcs { + t.Run(tc.name, func(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { os.Args = oldArgs }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + + // Feed stdin to the process. This also triggers is-piped logic in utils.Prompt. 
+ r, w, err := os.Pipe() + if err != nil { + t.Fatalf("Pipe: %v", err) + } + if _, err := w.WriteString(tc.stdin); err != nil { + _ = r.Close() + _ = w.Close() + t.Fatalf("WriteString(stdin): %v", err) + } + if err := w.Close(); err != nil { + _ = r.Close() + t.Fatalf("Close(stdin writer): %v", err) + } + + oldStdin := os.Stdin + t.Cleanup(func() { os.Stdin = oldStdin }) + os.Stdin = r + t.Cleanup(func() { _ = r.Close() }) + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatus = run(strings.Split(tc.args, " ")) + }) + testboil.FailTestIfDiff(t, gotStatus, tc.wantCode) + testboil.FailTestIfDiff(t, stdout, tc.wantOut) + }) + } +} diff --git a/main_tools_goldenfile_test.go b/main_tools_goldenfile_test.go new file mode 100644 index 0000000..c65042b --- /dev/null +++ b/main_tools_goldenfile_test.go @@ -0,0 +1,77 @@ +package main + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_TOOLS_lists_tools_and_footer(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatus = run(strings.Split("tools", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatus, 0) + + // We don't assert the entire listing because it changes as tools are added. + // Instead, assert stable behaviors described in architecture/TOOLS.md. 
+ testboil.AssertStringContains(t, stdout, "Run 'clai tools ' for more details.\n") +} + +func Test_goldenFile_TOOLS_unknown_tool_errors(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + + var gotStatus int + stdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatus = run(strings.Split("tools definitely_not_a_tool", " ")) + }) + + if gotStatus == 0 { + t.Fatalf("expected non-zero status code") + } + if stdout != "" { + t.Fatalf("expected no stdout, got: %q", stdout) + } +} diff --git a/main_version_goldenfile_test.go b/main_version_goldenfile_test.go new file mode 100644 index 0000000..ef0fc6a --- /dev/null +++ b/main_version_goldenfile_test.go @@ -0,0 +1,44 @@ +package main + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/baalimago/go_away_boilerplate/pkg/testboil" +) + +func Test_goldenFile_VERSION_prints_version_and_exits_0(t *testing.T) { + oldArgs := os.Args + t.Cleanup(func() { + os.Args = oldArgs + }) + + confDir := t.TempDir() + required := []string{ + "conversations", + "profiles", + "mcpServers", + "conversations/dirs", + } + for _, dir := range required { + if err := os.MkdirAll(filepath.Join(confDir, dir), 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", dir, err) + } + } + + t.Setenv("CLAI_CONFIG_DIR", confDir) + + var gotStatusCode int + gotStdout := testboil.CaptureStdout(t, func(t *testing.T) { + gotStatusCode = run(strings.Split("version", " ")) + }) + + testboil.FailTestIfDiff(t, gotStatusCode, 0) + if gotStdout == "" { + t.Fatal("expected version output to be non-empty") + } + // The exact version depends on build info / VCS state; assert stable 
prefix. + testboil.AssertStringContains(t, gotStdout, "version: ") +}