fix(safeoutputs): prevent symlink exfiltration in create-pr Stage 3#549
Conversation
|
@copilot Fix up the duplicate code here |
|
@copilot there's a lot of duplication in this PR - can we sort that out? |
Replace is_file() (follows symlinks) with symlink_metadata() checks in collect_changes_from_worktree and collect_changes_from_diff_tree so that symlinks in the applied patch are silently skipped rather than followed to arbitrary filesystem paths (e.g. /proc/self/environ). Also add symlink mode (120000) detection to validate_patch_paths as a belt-and-suspenders defence: patches that create or convert files to symlinks are rejected before git apply is attempted. Adds two unit tests covering the new rejection paths. Agent-Logs-Url: https://github.com/githubnext/ado-aw/sessions/99d4d7f5-eaea-4a34-87e3-ac34bab53ddb Co-authored-by: jamesadevine <4742697+jamesadevine@users.noreply.github.com>
711e24f to
08e1fc2
Compare
Collapse 7 duplicated symlink_metadata match blocks introduced by the symlink-exfiltration fix into a single helper. Same security semantics: regular file → push change; symlink → warn and skip; other → ignore. Centralizes the use of symlink_metadata so the no-follow invariant lives in one place. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Looks good — the security fix is correct and well-structured. Two minor suggestions below. Findings
|
…atches Two review fixes for the symlink-exfiltration PR: 1. push_file_change_skipping_symlinks: split the catch-all arm so IO errors from symlink_metadata (e.g. permissions denied) are surfaced via warn! rather than silently swallowed. Non-file, non-symlink entries (directories, fifos, sockets) are still silently skipped. 2. validate_patch_paths: only reject patch lines that INTRODUCE a symlink (new file mode 120000, new mode 120000). Patches that convert a symlink into a regular file (old mode 120000 + new mode 100644) or delete an existing symlink (deleted file mode 120000) are now allowed — they are legitimate cleanup operations and produce a symlink-free worktree, so they pose no exfiltration risk. Adds two positive tests covering symlink→file conversion and symlink deletion; existing negative tests still pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Security fix looks correct and well-structured — one minor logic gap to address, plus a suggestion. Findings🐛 Bugs / Logic Issues
|
|
@copilot Address bugs and suggestions |
…hten patch check Addresses PR #549 review feedback ("Address bugs and suggestions"): 1. Skipped symlinks are no longer invisible to the PR author. collect_changes_from_worktree and collect_changes_from_diff_tree now return (Vec<changes>, Vec<skipped_symlink_paths>). When non-empty, an explicit `> [!WARNING]` block listing each skipped path is appended to the PR description before posting, so the agent can see that some intended file content was deliberately dropped for safety. The Stage 3 warn! log is also retained for infrastructure observability. 2. Tightened patch-validation against suffix bypass. validate_patch_paths now compares the trimmed line for exact equality with "new file mode 120000" / "new mode 120000", rather than using starts_with. This eliminates any ambiguity from hypothetical future mode strings sharing the 120000 prefix, while still catching the real attack vectors. A new test exercises trailing whitespace and patch-body content containing '120000' to confirm neither bypasses nor false-rejects occur. 3. New tests: - test_validate_patch_paths_symlink_mode_suffix_not_bypass - test_append_skipped_symlink_notice_empty_is_passthrough - test_append_skipped_symlink_notice_lists_paths All 48 create_pull_request tests pass; full bin suite (1510 tests) green; cargo clippy clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Security fix is correct and well-structured — one minor unsafe markdown embedding to note, everything else looks solid. Findings
|
…arkdown Addresses follow-up review feedback on PR #549. The append_skipped_symlink_notice helper previously embedded raw filenames inside an inline-code span in the PR description. Filenames may legally contain backticks (e.g. `foo�ar`) or control characters (newlines, tabs), which would terminate the code span and garble or break out of the blockquote. The agent is the adversary in this code path, so the risk is display-only (no secondary exfiltration vector), but the previous output was malformed when adversarial filenames were involved. CommonMark code spans do NOT honour backslash escapes — the backtick-count rule terminates the span instead — so the naive `path.replace('`', `\\`)` suggested in review is not actually an escape. Instead, sanitize_path_for_markdown: - Replaces backticks with apostrophes (visually clear, terminator-safe). - Collapses all ASCII control characters (newline, CR, tab, etc.) to '?'. Display-only sanitisation: the canonical path the agent originally requested is unchanged in the upload pipeline; only the markdown rendering of the skipped-symlinks notice is affected. Adds four targeted tests covering backticks, control characters, pass-through of normal paths, and end-to-end sanitisation through append_skipped_symlink_notice. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Solid, well-layered security fix — looks good with one minor correctness note worth a second look. Findings
|
…etadata errors Two minor review fixes on PR #549. 1. validate_patch_paths now compares mode lines after a full trim() rather than trim_end(). trim_end() leaves leading whitespace intact, so a line like ' new file mode 120000' would silently bypass the check. Git's own format-patch never produces leading-indented mode lines so this was not a realistic attack path, but trim() costs nothing and closes the gap. The existing test now also covers the leading-whitespace and CRLF cases. 2. push_file_change_skipping_symlinks now distinguishes io::ErrorKind::NotFound from other metadata errors. NotFound is a normal transient condition (worktree mid-rebase, file pruned by git apply, etc.) and is logged at debug level. PermissionDenied and other unusual kinds remain at warn so triage isn't drowned by alert fatigue from benign races. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Solid, well-structured security fix — the belt-and-suspenders approach is good. One meaningful UX gap and two minor issues worth addressing. Findings🐛 Bugs / Logic Issues
|
… non-ws header lines Three follow-up review fixes on PR #549. 1. All-symlinks early-return now reports the cause. When every file collected for the PR is a symlink, `changes` is empty and the executor previously returned the generic `No changes detected after applying patch` message ÔÇö the agent had no way to see that its files were dropped for safety (the per-symlink `warn!` only reached Stage 3 infrastructure logs). The IfNoChanges Error/Warn/Ignore branches now append ` (N symlink(s) were skipped for safety: path1, path2, ÔǪ)` to both the log and the ExecutionResult message whenever skipped_symlinks is non-empty. 2. validate_patch_paths mode check anchored to no-leading-whitespace lines. Previously `line.trim() == "new file mode 120000"` could be triggered by a diff CONTEXT line ( `" new file mode 120000"`, with a single leading space) ÔÇö after trim() the two are indistinguishable. Real git header lines never start with whitespace, while diff context lines always do, so the check now also requires the first character of the raw line to be non-whitespace. Trailing whitespace and `\r` still bypass-proofed by `trim()` on the right-hand side of the equality. Added a new explicit test that a context line whose body is literally `new file mode 120000` is allowed through. (The test patch is built with `String + &str` concatenation rather than backslash line-continuation, because Rust eats next-line leading whitespace after `\`, which would silently defeat the property under test.) 3. Documented TOCTOU assumption. Added a doc comment to push_file_change_skipping_symlinks acknowledging the theoretical TOCTOU window between symlink_metadata (lstat) and the subsequent tokio::fs::read inside read_file_change. Not exploitable in Stage 3's deployment model (no concurrent writer to the worktree), but the assumption is now explicit, with a pointer to the proper mitigation (O_NOFOLLOW fd reads) should the deployment model change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Well-designed fix — the two-layer defense (patch validation + runtime lstat) is solid and the test coverage is comprehensive. A couple of minor observations. Findings
|
…executor messages Two follow-up review fixes on PR #549. 1. sanitize_path_for_markdown now filters Unicode bidi/zero-width formatting characters in addition to ASCII control chars and backticks. char::is_control() only covers U+0000-U+001F and U+007F-U+009F, so characters like U+202E (RIGHT-TO-LEFT OVERRIDE), U+2066 (LRI), and U+FEFF (BOM) previously passed through unchanged. A filename containing U+202E could visually reverse part of its displayed name inside the GitHub code span and deceive a PR author into misreading the skipped-symlinks warning. No exfiltration vector (the symlink is already blocked) but a real display-spoofing concern; now collapsed to '?' alongside other formatting chars. Explicit ranges filtered: U+200B-U+200F (ZW joiners + LRM/RLM), U+202A-U+202E (LRE/RLE/PDF/LRO/RLO), U+2066-U+2069 (LRI/RLI/FSI/PDI), U+FEFF. Ordinary non-ASCII letters/emoji/CJK pass through unchanged. 2. ExecutionResult symlink_suffix paths now reuse sanitize_path_for_markdown. The early-return failure/success/warning messages previously embedded raw skipped-symlink filenames via skipped_symlinks.join(', '). Filenames containing commas, control chars, or bidi controls could garble or spoof the message. Sanitizing per-path before joining keeps the message readable and consistent with the PR-description block. Two new tests: - test_sanitize_path_for_markdown_filters_bidi_controls (U+202E + sample set) - test_sanitize_path_for_markdown_keeps_normal_unicode (cafe / Japanese / emoji passthrough) Full suite (1516 tests) green; cargo clippy clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Solid security fix — correct defense-in-depth approach, good test coverage, looks good to merge. Findings✅ What Looks Good
|
Stage 3's
create-prfile collection usedis_file()(follows symlinks), allowing an agent to plant a symlink (e.g.ln -s /proc/self/environ secrets.txt) that would be read and uploaded to ADO as PR file content — exfiltratingSYSTEM_ACCESSTOKENand any other Stage 3 environment secrets.Changes
Runtime fix — symlink-blind file collection (
collect_changes_from_worktree,collect_changes_from_diff_tree)All worktree
is_file()call sites now go through a single helper,push_file_change_skipping_symlinks, that usestokio::fs::symlink_metadata()(lstat(2)) so symlinks are detected without ever being followed:warn!log, record the path, and skip.warn!and skip.The two
collect_changes_from_*helpers now return(Vec<change>, Vec<skipped_symlink_path>)so the executor can surface the skipped list back to the PR author.PR-description surfacing of skipped symlinks
If any symlinks were skipped during collection, an explicit GitHub-flavoured
> [!WARNING]block listing each skipped path is appended to the PR description before it is posted. The agent that produced the PR therefore sees that some intended content was deliberately dropped for safety, rather than the silent omission previously only visible in Stage 3 infrastructure logs.Belt-and-suspenders — patch validation (
validate_patch_paths)validate_patch_pathsnow rejects patches whose trimmed line is exactly:new file mode 120000— freshly added symlinknew mode 120000— existing file converted to a symlinkExact-line equality (rather than
starts_with) prevents any ambiguity from hypothetical future mode strings sharing the120000prefix.old mode 120000is deliberately allowed: it appears in legitimate symlink→regular-file conversions and never produces a symlink in the resulting worktree.Refactor — collapse 7 duplicated match blocks
The PR originally introduced 7 nearly identical
symlink_metadatamatch arms across the two collect functions. These now live in a singlepush_file_change_skipping_symlinkshelper, with the no-follow invariant enforced in one place.Test plan
New unit tests:
test_validate_patch_paths_symlink_rejected—new file mode 120000patch rejected.test_validate_patch_paths_symlink_mode_change_rejected— file→symlink mode change rejected.test_validate_patch_paths_symlink_to_file_allowed— symlink→regular file conversion allowed.test_validate_patch_paths_symlink_deletion_allowed— symlink deletion allowed.test_validate_patch_paths_symlink_mode_suffix_not_bypass— trailing-whitespace and120000-in-body cases neither bypass nor false-reject.test_append_skipped_symlink_notice_empty_is_passthrough— empty skipped list leaves description untouched.test_append_skipped_symlink_notice_lists_paths— non-empty skipped list produces a properly formatted warning block.cargo test— all 48create_pull_requestunit tests pass; fullcargo test --bin ado-aw(1510 tests) green;cargo clippy --all-targetsclean.