fix(safeoutputs): prevent symlink exfiltration in create-pr Stage 3 #549

Merged
jamesadevine merged 9 commits into main from copilot/fix-symlink-exfiltration
May 17, 2026

Contributor

Copilot AI commented May 15, 2026

Stage 3's create-pr file collection used is_file() (follows symlinks), allowing an agent to plant a symlink (e.g. ln -s /proc/self/environ secrets.txt) that would be read and uploaded to ADO as PR file content — exfiltrating SYSTEM_ACCESSTOKEN and any other Stage 3 environment secrets.

Changes

Runtime fix — symlink-blind file collection (collect_changes_from_worktree, collect_changes_from_diff_tree)

All worktree is_file() call sites now go through a single helper, push_file_change_skipping_symlinks, that uses tokio::fs::symlink_metadata() (lstat(2)) so symlinks are detected without ever being followed:

  • Regular file → read and push the change as before.
  • Symlink → emit a warn! log, record the path, and skip.
  • Directory / fifo / socket → silently skip.
  • Metadata read error (e.g. permission denied) → warn! and skip.

The two collect_changes_from_* helpers now return (Vec<change>, Vec<skipped_symlink_path>) so the executor can surface the skipped list back to the PR author.
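
The four outcomes listed above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: it uses `std::fs::symlink_metadata` instead of the tokio variant, and the `Collected` enum and `collect_without_following` name are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical result type: what happened to one worktree path.
#[derive(Debug, PartialEq)]
enum Collected {
    /// Regular file whose bytes were read.
    Change(Vec<u8>),
    /// Symlink: recorded and skipped, never followed.
    SkippedSymlink,
    /// Directory, fifo, socket and similar: skipped silently.
    SkippedOther,
    /// Metadata (or the later read) failed.
    IoError(io::ErrorKind),
}

/// Classify `path` with lstat semantics and read it only if it is a
/// regular file. `symlink_metadata` inspects the link itself, so a link
/// to /proc/self/environ is reported as a symlink rather than a file.
fn collect_without_following(path: &Path) -> Collected {
    match fs::symlink_metadata(path) {
        Ok(meta) if meta.file_type().is_symlink() => Collected::SkippedSymlink,
        // Note: `fs::read` itself follows symlinks; this sketch assumes no
        // concurrent writer swaps the file between the two calls.
        Ok(meta) if meta.is_file() => match fs::read(path) {
            Ok(bytes) => Collected::Change(bytes),
            Err(e) => Collected::IoError(e.kind()),
        },
        Ok(_) => Collected::SkippedOther,
        Err(e) => Collected::IoError(e.kind()),
    }
}
```

A caller would push `Change` bytes into the change list and append the path to the skipped list on `SkippedSymlink`, which matches the tuple return shape described above.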

PR-description surfacing of skipped symlinks

If any symlinks were skipped during collection, an explicit GitHub-flavoured > [!WARNING] block listing each skipped path is appended to the PR description before it is posted. The agent that produced the PR therefore sees that some intended content was deliberately dropped for safety, rather than the silent omission previously only visible in Stage 3 infrastructure logs.
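
A rough sketch of how such a warning block could be appended is shown here. The function name, signature, and the wording of the notice are assumptions made for illustration, and the paths are assumed to be display-safe already.

```rust
/// Hypothetical helper: append a GitHub-style warning block that lists
/// every skipped symlink. An empty list leaves the description untouched.
fn append_symlink_notice(description: &str, skipped: &[String]) -> String {
    if skipped.is_empty() {
        return description.to_string();
    }
    let mut out = String::from(description);
    // Blank line first so the blockquote starts a new markdown block.
    out.push_str("\n\n> [!WARNING]\n");
    out.push_str("> The following symlinks were skipped for safety:\n");
    for path in skipped {
        // Assumption: `path` contains no backticks or control characters.
        out.push_str("> - `");
        out.push_str(path);
        out.push_str("`\n");
    }
    out
}
```

Keeping the empty-list case a pure pass-through means PRs without symlinks produce exactly the description they did before.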

Belt-and-suspenders — patch validation (validate_patch_paths)

validate_patch_paths now rejects patches whose trimmed line is exactly:

  • new file mode 120000 — freshly added symlink
  • new mode 120000 — existing file converted to a symlink

Exact-line equality (rather than starts_with) prevents any ambiguity from hypothetical future mode strings sharing the 120000 prefix. old mode 120000 is deliberately allowed: it appears in legitimate symlink→regular-file conversions and never produces a symlink in the resulting worktree.
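
The exact-equality rule can be illustrated in a few lines. This is only a sketch of the comparison described above; the function name is hypothetical and the real check sits inside validate_patch_paths.

```rust
/// Hypothetical predicate: does this patch header line introduce a symlink?
/// Exact equality on the trimmed line means "old mode 120000",
/// "deleted file mode 120000" and longer mode strings do not match.
fn line_introduces_symlink(line: &str) -> bool {
    let trimmed = line.trim();
    trimmed == "new file mode 120000" || trimmed == "new mode 120000"
}
```

With `starts_with`, a line such as `new file mode 1200001` would also be rejected; with equality it is not, so the check fires only on the two mode lines that actually create a symlink.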

Refactor — collapse 7 duplicated match blocks

The PR originally introduced 7 nearly identical symlink_metadata match arms across the two collect functions. These now live in a single push_file_change_skipping_symlinks helper, with the no-follow invariant enforced in one place.

Test plan

New unit tests:

  • test_validate_patch_paths_symlink_rejected — new file mode 120000 patch rejected.
  • test_validate_patch_paths_symlink_mode_change_rejected — file→symlink mode change rejected.
  • test_validate_patch_paths_symlink_to_file_allowed — symlink→regular file conversion allowed.
  • test_validate_patch_paths_symlink_deletion_allowed — symlink deletion allowed.
  • test_validate_patch_paths_symlink_mode_suffix_not_bypass — trailing-whitespace and 120000-in-body cases neither bypass nor false-reject.
  • test_append_skipped_symlink_notice_empty_is_passthrough — empty skipped list leaves description untouched.
  • test_append_skipped_symlink_notice_lists_paths — non-empty skipped list produces a properly formatted warning block.

cargo test — all 48 create_pull_request unit tests pass; full cargo test --bin ado-aw (1510 tests) green; cargo clippy --all-targets clean.

Copilot AI changed the title from "[WIP] Fix symlink vulnerability in create-pr patch" to "fix(safeoutputs): prevent symlink exfiltration in create-pr Stage 3" May 15, 2026
Copilot AI requested a review from jamesadevine May 15, 2026 06:25
@jamesadevine
Collaborator

@copilot Fix up the duplicate code here

@jamesadevine
Collaborator

@copilot there's a lot of duplication in this PR - can we sort that out?

Copilot AI and others added 2 commits May 17, 2026 18:18
Replace is_file() (follows symlinks) with symlink_metadata() checks in
collect_changes_from_worktree and collect_changes_from_diff_tree so
that symlinks in the applied patch are silently skipped rather than
followed to arbitrary filesystem paths (e.g. /proc/self/environ).

Also add symlink mode (120000) detection to validate_patch_paths as a
belt-and-suspenders defence: patches that create or convert files to
symlinks are rejected before git apply is attempted.

Adds two unit tests covering the new rejection paths.

Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/99d4d7f5-eaea-4a34-87e3-ac34bab53ddb

Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
@jamesadevine jamesadevine force-pushed the copilot/fix-symlink-exfiltration branch from 711e24f to 08e1fc2 Compare May 17, 2026 17:20
Collapse 7 duplicated symlink_metadata match blocks introduced by the symlink-exfiltration fix into a single helper. Same security semantics: regular file → push change; symlink → warn and skip; other → ignore. Centralizes the use of symlink_metadata so the no-follow invariant lives in one place.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jamesadevine jamesadevine marked this pull request as ready for review May 17, 2026 17:24
@github-actions
Contributor

🔍 Rust PR Review

Summary: Looks good — the security fix is correct and well-structured. Two minor suggestions below.

Findings

⚠️ Suggestions

  • src/safeoutputs/create_pull_request.rs:1817 — The _ => {} arm in push_file_change_skipping_symlinks silently swallows IO errors from symlink_metadata (e.g. permissions denied). The original is_file() also returned false on errors, so this isn't a regression, but those errors are now invisible where a warn! would aid diagnostics:

    Err(e) => {
        warn!("Failed to read metadata for {}: {} — skipping", file_path, e);
    }
  • src/safeoutputs/create_pull_request.rs:1908 — old mode 120000 is rejected, which means a patch converting a symlink into a regular file is also blocked. That's not an attack vector; it's a legitimate operation. Consider tightening to only reject new file mode 120000 and new mode 120000 (the two lines that actually introduce symlinks), while allowing old mode 120000 + new mode 100644 patches through. Not critical — the current behavior is safe but will confuse users with symlink-cleanup patches.

✅ What Looks Good

  • All 8 is_file() sites are replaced — no call sites were missed (confirmed with grep).
  • symlink_metadata (lstat) is the correct syscall here; metadata (stat) would follow the symlink and defeat the fix.
  • The belt-and-suspenders validate_patch_paths addition correctly catches the patch-based attack vector independently of the worktree collection path.
  • Both new tests exercise realistic payloads (/proc/self/environ, old mode → new mode 120000) and pass cleanly.
  • Refactoring the repeated pattern into push_file_change_skipping_symlinks is clean — consistent behavior across all 8 sites is now enforced structurally rather than by convention.

Generated by Rust PR Reviewer for issue #549 · ● 709.8K ·

…atches

Two review fixes for the symlink-exfiltration PR:

1. push_file_change_skipping_symlinks: split the catch-all arm so IO errors from symlink_metadata (e.g. permissions denied) are surfaced via warn! rather than silently swallowed. Non-file, non-symlink entries (directories, fifos, sockets) are still silently skipped.

2. validate_patch_paths: only reject patch lines that INTRODUCE a symlink (new file mode 120000, new mode 120000). Patches that convert a symlink into a regular file (old mode 120000 + new mode 100644) or delete an existing symlink (deleted file mode 120000) are now allowed — they are legitimate cleanup operations and produce a symlink-free worktree, so they pose no exfiltration risk.

Adds two positive tests covering symlink→file conversion and symlink deletion; existing negative tests still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

🔍 Rust PR Review

Summary: Security fix looks correct and well-structured — one minor logic gap to address, plus a suggestion.

Findings

🐛 Bugs / Logic Issues

  • validate_patch_paths: in_diff guard is not active for symlink mode-line checks in all formats.
    The old mode / new mode lines (space-separated, no file) appear before the --- a/ / +++ b/ lines in a git format-patch diff, and the in_diff flag is set to true on the diff --git line — so they are gated correctly. However, new file mode 120000 can theoretically appear as part of a commit message body before the first diff --git line (an unusual edge case, but the in_diff gate protects against accidental false-positives here anyway). ✅ No action needed, just confirming the guard is correct.

  • collect_changes_from_diff_tree — rename-with-modification no longer guards status_code != "R100" with a file-existence check. Before the PR: if status_code != "R100" && new_full_path.is_file(). After: the is_file() guard is gone and replaced by push_file_change_skipping_symlinks. If the file is absent (unexpected but possible in a corrupt worktree), the Err(e) arm in push_file_change_skipping_symlinks now emits a warning and skips rather than silently doing nothing. This is actually better behavior — just flagging it so reviewers are aware the semantics changed slightly (warn + skip vs. silent skip).

⚠️ Suggestions

  • src/safeoutputs/create_pull_request.rs:1811-1815 — symlink skip is silent to the PR author. The warn! log is Stage 3 infrastructure noise; the agent that created the PR will never see it. If a symlink is legitimately included in a PR (unusual but not impossible), the resulting PR will silently be missing that path's content, which could be confusing. Consider surfacing this in the ExecutionResult metadata (e.g., a warnings field) or including it in the PR description that gets posted. Not a blocker, but a UX rough edge.

  • validate_patch_paths does not explicitly reject new file mode 120000 (trailing space/digits like new file mode 1200001). starts_with("new file mode 120000") is satisfied by any line that begins with that prefix. This is actually fine for rejection (overly strict → false rejects are safe), but worth noting that it won't accidentally allow a bypass via suffix characters.

✅ What Looks Good

  • The primary fix (symlink_metadata instead of is_file()) is exactly the right approach — symlink_metadata calls lstat(2) and will never cross a symlink boundary.
  • The push_file_change_skipping_symlinks helper correctly DRYs up 8 previously duplicated call sites.
  • The belt-and-suspenders ordering is correct: validate_patch_paths is called at line 725 before git am/git apply runs, so symlinks never land in the worktree in the first place; push_file_change_skipping_symlinks is a second independent safety net for anything that bypassed the patch validator.
  • test_validate_patch_paths_symlink_to_file_allowed and test_validate_patch_paths_symlink_deletion_allowed are important regression tests that confirm the intentional allowances — well done including those.
  • All error paths in push_file_change_skipping_symlinks propagate correctly with Ok(()) return and no unwrap.

Generated by Rust PR Reviewer for issue #549 · ● 454.8K ·

@jamesadevine
Collaborator

@copilot Address bugs and suggestions

…hten patch check

Addresses PR #549 review feedback ("Address bugs and suggestions"):

1. Skipped symlinks are no longer invisible to the PR author.
   collect_changes_from_worktree and collect_changes_from_diff_tree now return
   (Vec<changes>, Vec<skipped_symlink_paths>). When non-empty, an explicit
   `> [!WARNING]` block listing each skipped path is appended to the PR
   description before posting, so the agent can see that some intended file
   content was deliberately dropped for safety. The Stage 3 warn! log is also
   retained for infrastructure observability.

2. Tightened patch-validation against suffix bypass.
   validate_patch_paths now compares the trimmed line for exact equality with
   "new file mode 120000" / "new mode 120000", rather than using starts_with.
   This eliminates any ambiguity from hypothetical future mode strings sharing
   the 120000 prefix, while still catching the real attack vectors. A new
   test exercises trailing whitespace and patch-body content containing
   '120000' to confirm neither bypasses nor false-rejects occur.

3. New tests:
   - test_validate_patch_paths_symlink_mode_suffix_not_bypass
   - test_append_skipped_symlink_notice_empty_is_passthrough
   - test_append_skipped_symlink_notice_lists_paths

   All 48 create_pull_request tests pass; full bin suite (1510 tests) green;
   cargo clippy clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

🔍 Rust PR Review

Summary: Security fix is correct and well-structured — one minor unsafe markdown embedding to note, everything else looks solid.

Findings

⚠️ Suggestions

  • src/safeoutputs/create_pull_request.rs:1907-1909 — Symlink paths are embedded in the PR description markdown without escaping backticks:
    notice.push_str("> - `");
    notice.push_str(path);   // path can contain backtick characters
    notice.push_str("`\n");
    Unix filenames may legally contain backtick characters (e.g. foo`bar), which would break out of the inline-code context and garble the blockquote. Paths with newlines would be even more disruptive. Since the agent is the adversary here the risk is display-only (not a secondary exfiltration), but a quick fix is to replace backticks in the path:
    notice.push_str("> - `");
    notice.push_str(&path.replace('`', "\\`"));
    notice.push_str("`\n");

✅ What Looks Good

  • Core fix is exactly right: symlink_metadata calls lstat(2) and never crosses a symlink boundary. All eight former is_file() call sites are replaced — confirmed exhaustive.
  • Previous review items all addressed: Err(e) arm now emits a warn!; old mode 120000 is correctly not rejected (symlink→regular-file conversion passes); skipped paths are surfaced in the PR description rather than silently lost in infrastructure logs; duplicate match arms collapsed into a single helper.
  • validate_patch_paths belt-and-suspenders is correct: the in_diff guard (set on diff --git, cleared on From ) correctly gates the mode-line check so commit message bodies that happen to contain new file mode 120000 don't false-fire. The exact-trim equality (not starts_with) correctly rejects trailing-whitespace variants while accepting unrelated body content containing 120000.
  • Test coverage is thorough: the symlink-to-file-allowed and symlink-deletion-allowed tests are the important regression cases that confirm intentional allowances — good that they were included.

Generated by Rust PR Reviewer for issue #549 · ● 377.4K ·

…arkdown

Addresses follow-up review feedback on PR #549.

The append_skipped_symlink_notice helper previously embedded raw filenames
inside an inline-code span in the PR description. Filenames may legally
contain backticks (e.g. foo`bar) or control characters (newlines, tabs),
which would terminate the code span and garble or break out of the blockquote.

The agent is the adversary in this code path, so the risk is display-only
(no secondary exfiltration vector), but the previous output was malformed
when adversarial filenames were involved.

CommonMark code spans do NOT honour backslash escapes — the backtick-count
rule terminates the span instead — so the naive `path.replace('`', "\\`")`
suggested in review is not actually an escape. Instead, sanitize_path_for_markdown:

  - Replaces backticks with apostrophes (visually clear, terminator-safe).
  - Collapses all ASCII control characters (newline, CR, tab, etc.) to '?'.

Display-only sanitisation: the canonical path the agent originally requested
is unchanged in the upload pipeline; only the markdown rendering of the
skipped-symlinks notice is affected.

Adds four targeted tests covering backticks, control characters, pass-through
of normal paths, and end-to-end sanitisation through append_skipped_symlink_notice.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
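
The display-only sanitisation described in this commit message can be sketched as below. The function name is hypothetical, and replacing each control character with its own '?' (rather than merging runs) is an assumption.

```rust
/// Hypothetical display-only sanitiser for a path that will be shown
/// inside a markdown inline-code span. Backslash escapes do not work in
/// code spans, so offending characters are substituted instead.
fn sanitize_for_code_span(path: &str) -> String {
    path.chars()
        .map(|c| match c {
            // A backtick could terminate the code span early.
            '`' => '\'',
            // Newlines, tabs and other control characters could break
            // out of the blockquote line.
            c if c.is_control() => '?',
            c => c,
        })
        .collect()
}
```

Only the rendered notice uses the sanitised string; the original path value is left untouched for everything else.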
@github-actions
Contributor

🔍 Rust PR Review

Summary: Solid, well-layered security fix — looks good with one minor correctness note worth a second look.


Findings

⚠️ Suggestions

  • src/safeoutputs/create_pull_request.rs:2027 — trim_end() instead of trim() in mode check

    let trimmed = line.trim_end();
    trimmed == "new file mode 120000" || trimmed == "new mode 120000"

    trim_end() strips trailing whitespace (including \r) but leaves leading whitespace intact. A line like " new file mode 120000" (leading space) would therefore silently bypass the check. Git's own format-patch / diff never emits leading-indented mode lines, so this is not a realistic attack path today, but trim() costs nothing and removes the gap entirely. The test suite covers trailing-whitespace bypass but has no coverage for the leading-whitespace case.

  • push_file_change_skipping_symlinks error arm — misleading warn text

    Err(e) => {
        warn!("Failed to read metadata for {}: {} — skipping", file_path, e);
    }

    Err on symlink_metadata fires for both "file doesn't exist" (a normal transient condition during cherry-pick/rebase worktree setup) and "permission denied" (suspicious). Both are silently skipped with the same message, making it harder to triage whether a skipped file was an attack or a benign race. Consider if e.kind() == io::ErrorKind::NotFound { debug!(...) } else { warn!(...) }. Minor, but avoids alert fatigue in Stage 3 logs.

✅ What Looks Good

  • Core fix is correct: replacing is_file() (stat, follows symlinks) with tokio::fs::symlink_metadata() (lstat, does not follow) is exactly the right syscall. The invariant is enforced in one place via the new helper, not scattered across 7 match arms — good refactor.
  • Defense-in-depth: validate_patch_paths now rejects symlink-introducing mode lines (new file mode 120000, new mode 120000) before the patch is applied, while push_file_change_skipping_symlinks is the safety net at read time. The deliberate allowance of old mode 120000 (symlink→regular file conversion) is correct and well-documented.
  • sanitize_path_for_markdown: replacing backticks with apostrophes and collapsing control characters is the right approach for CommonMark inline-code spans (backslash escapes don't work inside code spans). The end-to-end test test_append_skipped_symlink_notice_sanitizes_paths verifies this correctly.
  • Test coverage: 7 new unit tests cover the important edge cases — symlink rejection, mode-change rejection, symlink-to-file allowance, deletion allowance, trailing-whitespace bypass prevention, and path sanitization. The test_validate_patch_paths_symlink_mode_suffix_not_bypass test is particularly thorough.
  • PR-description surfacing of skipped paths: surfacing skipped symlinks in the PR description is a thoughtful UX detail — Stage 3 infra logs are invisible to the agent that submitted the PR, so this prevents silent data loss from the author's perspective.

Generated by Rust PR Reviewer for issue #549 · ● 499.1K ·

…etadata errors

Two minor review fixes on PR #549.

1. validate_patch_paths now compares mode lines after a full trim() rather
   than trim_end(). trim_end() leaves leading whitespace intact, so a line
   like ' new file mode 120000' would silently bypass the check. Git's own
   format-patch never produces leading-indented mode lines so this was not a
   realistic attack path, but trim() costs nothing and closes the gap. The
   existing test now also covers the leading-whitespace and CRLF cases.

2. push_file_change_skipping_symlinks now distinguishes io::ErrorKind::NotFound
   from other metadata errors. NotFound is a normal transient condition
   (worktree mid-rebase, file pruned by git apply, etc.) and is logged at
   debug level. PermissionDenied and other unusual kinds remain at warn so
   triage isn't drowned by alert fatigue from benign races.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
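
The log-level split in item 2 could look roughly like this. The `LogLevel` enum and function name are hypothetical stand-ins for the real `debug!`/`warn!` macro calls.

```rust
use std::io;

/// Hypothetical stand-in for the logging macros used in the real code.
#[derive(Debug, PartialEq)]
enum LogLevel {
    Debug,
    Warn,
}

/// Choose a log level for a failed metadata lookup: a missing file is
/// treated as a benign race, anything else stays visible at warn.
fn level_for_metadata_error(err: &io::Error) -> LogLevel {
    if err.kind() == io::ErrorKind::NotFound {
        LogLevel::Debug
    } else {
        LogLevel::Warn
    }
}
```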
@github-actions
Contributor

🔍 Rust PR Review

Summary: Solid, well-structured security fix — the belt-and-suspenders approach is good. One meaningful UX gap and two minor issues worth addressing.

Findings

🐛 Bugs / Logic Issues

  • src/safeoutputs/create_pull_request.rs:972–993 — All-symlinks early-return doesn't surface reason to the agent

    When every file collected for the PR is a symlink, changes will be empty, skipped_symlinks will be non-empty, and the code returns early at the if changes.is_empty() block with the generic message "No changes detected after applying patch". The agent gets no indication that its files were dropped for safety (the symlink warning is only visible in Stage 3 infrastructure logs). This is the hardest failure mode to debug from the agent's perspective.

    Suggested fix: embed the skip count in the early-return messages, e.g.:

    let symlink_suffix = if skipped_symlinks.is_empty() {
        String::new()
    } else {
        format!(" ({} symlink(s) were skipped for safety)", skipped_symlinks.len())
    };
    return Ok(ExecutionResult::failure(
        format!("No changes detected after applying patch{}", symlink_suffix),
    ));

⚠️ Suggestions

  • src/safeoutputs/create_pull_request.rs:2035–2038 — trim() can produce a false-positive on diff context lines

    The check uses line.trim() == "new file mode 120000" inside the in_diff block. A diff context line (single-space prefix) containing the literal file content new file mode 120000 would become "new file mode 120000" after trim() and be falsely rejected. Added lines (+-prefixed) and removed lines (--prefixed) are immune because their non-whitespace prefix survives trim(). Context lines are not.

    In practice this is vanishingly unlikely, but trim_start_matches(|c: char| c.is_ascii_whitespace()) → checking starts_with("new file mode 120000") while also checking there's no trailing non-space text, or anchoring to only fire when line starts with a non-whitespace character (i.e. a real header line), would make this airtight.

  • src/safeoutputs/create_pull_request.rs:1867–1869 — Theoretical TOCTOU between symlink_metadata and read (lstat-then-open)

    push_file_change_skipping_symlinks uses symlink_metadata() (lstat) and then, if the file looks regular, calls read_file_change → tokio::fs::read() (which follows symlinks). An atomic rename(2) between the two calls could swap in a symlink. This is not practically exploitable because Stage 3's worktree has no concurrent writer, but it's worth noting for completeness. The proper mitigation would be opening the file with O_NOFOLLOW and reading from the fd, but that requires platform-specific code and is overkill given the deployment model.

✅ What Looks Good

  • The layered defence (patch validation before git apply and symlink_metadata guard during file collection) is the right approach — each layer independently blocks the attack even if the other is somehow bypassed.
  • The append_skipped_symlink_notice + sanitize_path_for_markdown chain correctly prevents adversarial filenames (backtick injection, embedded newlines) from breaking the PR description's blockquote layout. The test coverage for this is thorough.
  • Collapsing 7 duplicated match blocks into push_file_change_skipping_symlinks is a clean refactor with a single place to evolve the no-follow invariant.
  • The deleted file mode 120000 / old mode 120000 allow-list reasoning is clearly documented and correct — only destructive symlink-to-regular-file conversions are permitted.

Generated by Rust PR Reviewer for issue #549 · ● 729K ·

… non-ws header lines

Three follow-up review fixes on PR #549.

1. All-symlinks early-return now reports the cause.
   When every file collected for the PR is a symlink, `changes` is empty
   and the executor previously returned the generic `No changes detected
   after applying patch` message — the agent had no way to see that its
   files were dropped for safety (the per-symlink `warn!` only reached
   Stage 3 infrastructure logs). The IfNoChanges Error/Warn/Ignore branches
   now append ` (N symlink(s) were skipped for safety: path1, path2, )`
   to both the log and the ExecutionResult message whenever skipped_symlinks
   is non-empty.

2. validate_patch_paths mode check anchored to no-leading-whitespace lines.
   Previously `line.trim() == "new file mode 120000"` could be triggered by
   a diff CONTEXT line ( `" new file mode 120000"`, with a single leading
   space) — after trim() the two are indistinguishable. Real git header
   lines never start with whitespace, while diff context lines always do, so
   the check now also requires the first character of the raw line to be
   non-whitespace. Trailing whitespace and `\r` are still neutralised by
   `trim()` on the right-hand side of the equality.
   Added a new explicit test that a context line whose body is literally
   `new file mode 120000` is allowed through. (The test patch is built with
   `String + &str` concatenation rather than backslash line-continuation,
   because Rust eats next-line leading whitespace after `\`, which would
   silently defeat the property under test.)

3. Documented TOCTOU assumption.
   Added a doc comment to push_file_change_skipping_symlinks acknowledging the
   theoretical TOCTOU window between symlink_metadata (lstat) and the
   subsequent tokio::fs::read inside read_file_change. Not exploitable in
   Stage 3's deployment model (no concurrent writer to the worktree), but
   the assumption is now explicit, with a pointer to the proper mitigation
   (O_NOFOLLOW fd reads) should the deployment model change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
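
The anchored mode-line check from item 2 can be sketched as a small scanner. The function name and return type are hypothetical; the `in_diff` handling follows the behaviour the reviews describe (set on `diff --git`, cleared on `From `).

```rust
/// Hypothetical scanner: return the 1-based line number of the first
/// header line that introduces a symlink, or None if the patch has none.
fn find_symlink_mode_line(patch: &str) -> Option<usize> {
    let mut in_diff = false;
    for (idx, line) in patch.lines().enumerate() {
        if line.starts_with("diff --git ") {
            in_diff = true;
            continue;
        }
        if line.starts_with("From ") {
            // Start of the next mail-style commit header.
            in_diff = false;
            continue;
        }
        if !in_diff {
            continue; // commit message body: never inspected
        }
        // Real git header lines never start with whitespace, while diff
        // context lines always do, so indented lines are ignored.
        let starts_with_ws = line.chars().next().map_or(true, char::is_whitespace);
        if starts_with_ws {
            continue;
        }
        let trimmed = line.trim();
        if trimmed == "new file mode 120000" || trimmed == "new mode 120000" {
            return Some(idx + 1);
        }
    }
    None
}
```

A context line whose body is literally `new file mode 120000` starts with a space and is therefore skipped, while a genuine header with trailing whitespace still matches after `trim()`.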
@github-actions
Contributor

🔍 Rust PR Review

Summary: Well-designed fix — the two-layer defense (patch validation + runtime lstat) is solid and the test coverage is comprehensive. A couple of minor observations.

Findings

⚠️ Suggestions

  • sanitize_path_for_markdown — Unicode bidi override characters not filtered (src/safeoutputs/create_pull_request.rs:1974): char::is_control() in Rust covers U+0000–U+001F and U+007F–U+009F only. Unicode bidirectional formatting characters like U+202E (RIGHT-TO-LEFT OVERRIDE) and U+2066 (LEFT-TO-RIGHT ISOLATE) pass through unchanged. Inside a GitHub code span these can visually reverse or obfuscate the displayed filename (e.g. in a symlink name containing \u{202e}, the characters after the override are rendered in reverse order). No security bypass — the symlink itself is already blocked — but it could deceive a PR author into misreading the warning. Consider additionally filtering non-ASCII characters or adding an explicit bidi-character check:

    c if c.is_control() || matches!(c,
        '\u{200B}'..='\u{200F}' | '\u{202A}'..='\u{202E}' |
        '\u{2066}'..='\u{2069}' | '\u{FEFF}') => '?',
  • symlink_suffix in ExecutionResult messages uses raw (unsanitized) paths (src/safeoutputs/create_pull_request.rs:~980): The skipped_symlinks.join(", ") in the failure/success/warning result strings embeds raw filenames without any sanitization. This is a much lower-risk path (plain text, not markdown), but a filename containing commas or unusual Unicode would make the message ambiguous. Since sanitize_path_for_markdown already exists, it could easily be reused here for consistency.

✅ What Looks Good

  • Two-layer defense is the right design: Rejecting new file mode 120000 / new mode 120000 in validate_patch_paths stops the attack before the worktree is even modified; the symlink_metadata check in push_file_change_skipping_symlinks catches any pre-existing or post-apply symlinks that slip through. Belt-and-suspenders is correct for this threat model.
  • TOCTOU acknowledged correctly: The comment at line 1891 accurately explains why the lstat→read window is not exploitable in Stage 3's serial-collector/no-concurrent-writer model and prescribes the right mitigation (O_NOFOLLOW) if the deployment model changes.
  • validate_patch_paths mode-line check is well-anchored: The !starts_with_ws guard correctly distinguishes real git extended-header lines from diff-context lines that happen to contain the same string as body text, and trim() on the right-hand side ensures CRLF / trailing-whitespace can't bypass the exact-equality check.
  • Refactor is clean: Collapsing 7 nearly-identical match arms into push_file_change_skipping_symlinks is the right call — the no-follow invariant is now enforced in exactly one place.
  • Test coverage is thorough: All the interesting edge cases — deletion of existing symlinks, symlink→regular-file conversion, context-line false-positive, trailing whitespace bypass, CRLF bypass — are explicitly tested.

Generated by Rust PR Reviewer for issue #549 · ● 434K ·

…executor messages

Two follow-up review fixes on PR #549.

1. sanitize_path_for_markdown now filters Unicode bidi/zero-width formatting
   characters in addition to ASCII control chars and backticks.
   char::is_control() only covers U+0000-U+001F and U+007F-U+009F, so
   characters like U+202E (RIGHT-TO-LEFT OVERRIDE), U+2066 (LRI), and U+FEFF
   (BOM) previously passed through unchanged. A filename containing U+202E
   could visually reverse part of its displayed name inside the GitHub code
   span and deceive a PR author into misreading the skipped-symlinks warning.
   No exfiltration vector (the symlink is already blocked) but a real
   display-spoofing concern; now collapsed to '?' alongside other formatting
   chars.
   Explicit ranges filtered: U+200B-U+200F (ZW joiners + LRM/RLM),
   U+202A-U+202E (LRE/RLE/PDF/LRO/RLO), U+2066-U+2069 (LRI/RLI/FSI/PDI),
   U+FEFF. Ordinary non-ASCII letters/emoji/CJK pass through unchanged.

2. ExecutionResult symlink_suffix paths now reuse sanitize_path_for_markdown.
   The early-return failure/success/warning messages previously embedded raw
   skipped-symlink filenames via skipped_symlinks.join(', '). Filenames
   containing commas, control chars, or bidi controls could garble or spoof
   the message. Sanitizing per-path before joining keeps the message
   readable and consistent with the PR-description block.

Two new tests:
  - test_sanitize_path_for_markdown_filters_bidi_controls (U+202E + sample set)
  - test_sanitize_path_for_markdown_keeps_normal_unicode (cafe / Japanese /
    emoji passthrough)

Full suite (1516 tests) green; cargo clippy clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
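
Both changes in this commit can be sketched together. The function names are hypothetical, and the suffix wording is taken from the earlier commit message rather than from the source file.

```rust
/// Hypothetical sanitiser: backticks, control characters, and the bidi /
/// zero-width ranges listed above are neutralised; other Unicode is kept.
fn sanitize_for_display(path: &str) -> String {
    path.chars()
        .map(|c| match c {
            '`' => '\'',
            c if c.is_control() => '?',
            '\u{200B}'..='\u{200F}'
            | '\u{202A}'..='\u{202E}'
            | '\u{2066}'..='\u{2069}'
            | '\u{FEFF}' => '?',
            c => c,
        })
        .collect()
}

/// Hypothetical suffix builder for the "no changes" result message:
/// each path is sanitised before the list is joined.
fn symlink_suffix(skipped: &[String]) -> String {
    if skipped.is_empty() {
        return String::new();
    }
    let shown: Vec<String> = skipped.iter().map(|p| sanitize_for_display(p)).collect();
    format!(
        " ({} symlink(s) were skipped for safety: {})",
        skipped.len(),
        shown.join(", ")
    )
}
```

Sanitising per path before joining keeps the count accurate and leaves ordinary non-ASCII names such as accented or CJK filenames readable.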
@github-actions
Contributor

🔍 Rust PR Review

Summary: Solid security fix — correct defense-in-depth approach, good test coverage, looks good to merge.

Findings

✅ What Looks Good

  • push_file_change_skipping_symlinks helper correctly uses tokio::fs::symlink_metadata (lstat, no-follow) as the primary defense. Centralizing it in one place ensures the invariant can't be accidentally regressed in future callers.
  • validate_patch_paths mode check is correctly placed inside the in_diff guard (lines 2076–2124), so it only fires on actual git diff headers, not commit message bodies. The !starts_with_ws anchor correctly rejects the check for context lines (" new file mode 120000") while catching trailing-whitespace/CRLF variants via trim(). The tests for that edge case are thorough.
  • sanitize_path_for_markdown handles backticks, ASCII control chars, and the explicit bidi/zero-width ranges (U+200B–U+200F, U+202A–U+202E, U+2066–U+2069, U+FEFF). The doc comment accurately explains why char::is_control() alone is insufficient.
  • deleted file mode 120000 correctly not blocked — only additions and mode-changes-to-symlink are rejected.
  • old mode 120000 + new mode 100644 (symlink→regular-file cleanup) is correctly allowed, and the test asserts that specifically.
  • All three ExecutionResult branches (Error / Warn / Ignore) surface the skipped-symlink suffix, so agents see the diagnostic regardless of if-no-changes config.

⚠️ Suggestions

  • [src/safeoutputs/create_pull_request.rs — sanitize_path_for_markdown] The function is private and lives in the same file as its callers, which is fine. If another safe-output tool ever embeds user-controlled paths in markdown output (e.g., comment_on_work_item), there's a discoverability risk. Consider moving it to src/sanitize.rs alongside validate_single_path and the existing sanitizers — no behaviour change, just a home that future contributors will look in naturally.

  • [src/safeoutputs/create_pull_request.rs:294–303 — TOCTOU comment] The TOCTOU window between symlink_metadata and read_file_change is correctly documented and the reasoning ("no concurrent writer at this point in Stage 3") is sound. Worth keeping as-is; noting it here just so reviewers are aware it was analysed and consciously accepted.

  • [tests — test_validate_patch_paths_symlink_mode_change_rejected] The test uses Rust backslash-continuation ("...\n\<newline>whitespace"), which strips indentation from the next source line. Result: the patch string's old mode 100644 and new mode 120000 lines have no leading whitespace in the actual string, which is the correct form for a real git diff header — so the test is valid. Just noting the subtlety for future editors who might re-indent the string and accidentally introduce leading spaces that would change the test's semantics (adding a short comment there would prevent confusion).

Generated by Rust PR Reviewer for issue #549 · ● 598.9K ·

@jamesadevine jamesadevine merged commit f04c033 into main May 17, 2026
8 checks passed
@jamesadevine jamesadevine deleted the copilot/fix-symlink-exfiltration branch May 17, 2026 20:59
Development

Successfully merging this pull request may close these issues.

🔴 Red Team Audit — High: create-pr patch can contain symlink to exfiltrate Stage 3 write tokens

2 participants