feat(compile): default executor to System.AccessToken and add always-on Azure CLI (#873)
* feat(compile): default executor to System.AccessToken and add always-on Azure CLI
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* refactor(compile): detect az at pipeline time so missing azure-cli no longer crashes 1ES
Reviewer-requested fix: static AWF bind-mounts for /opt/az and /usr/bin/az
would break `docker run` on runners without azure-cli pre-installed (notably
some 1ES self-hosted pools), failing the pipeline before the agent ever
started.
Replace the static mounts with a runtime detection prepare step that sets
the ADO pipeline variable AW_AZ_MOUNTS via `##vso[task.setvariable]` when
both /usr/bin/az and /opt/az exist on the host, or emits a `task.logissue`
warning and leaves the variable unset otherwise.
The AWF invocation in the compiled YAML now includes a single
`$(AW_AZ_MOUNTS) \` line in the --mount chain. ADO interpolates the variable
at step start: present -> the two --mount args appear; absent -> the line
collapses to whitespace. No new trait method is added; only the existing
`prepare_steps` hook is used.
- AzureCliExtension: required_awf_mounts() now returns []; prepare_steps()
emits the detection bash step
- generate_awf_mounts: appends `$(AW_AZ_MOUNTS) \` when AzureCli is present
- Tests: rewrite static-mount assertions to assert the detection step + the
pipeline variable injection, plus a regression guard that no static az
mount is emitted
- Docs: docs/network.md and docs/tools.md updated with the runtime-detection
design and operator implications
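
The mount-chain change described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the parameter names and the example mount path are hypothetical, and the real `generate_awf_mounts` signature is not shown in this commit message. Only the `$(AW_AZ_MOUNTS) \` line and the "no static az mount" rule come from the text.

```rust
/// Hypothetical sketch of building the `--mount` chain for the AWF invocation.
/// `static_mounts` and `azure_cli_enabled` are illustrative parameters only.
fn generate_awf_mounts(static_mounts: &[&str], azure_cli_enabled: bool) -> String {
    // One `--mount <spec> \` continuation line per statically known mount.
    let mut lines: Vec<String> = static_mounts
        .iter()
        .map(|m| format!("--mount {} \\", m))
        .collect();

    if azure_cli_enabled {
        // No static /opt/az or /usr/bin/az mounts are emitted here.
        // ADO substitutes this macro before bash runs the step.
        lines.push("$(AW_AZ_MOUNTS) \\".to_string());
    }

    lines.join("\n")
}
```

Keeping the az mounts out of the static list is what lets the regression guard assert that no `/opt/az` mount appears in the compiled output.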
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(compile): graceful-degradation bug + cleanup per PR review
Address two findings from the Rust PR Reviewer bot on PR #873.
1. CRITICAL — AW_AZ_MOUNTS undefined when az is missing:
The runtime-detection step in AzureCliExtension only set the
AW_AZ_MOUNTS pipeline variable in the detected branch. In the
missing-az branch the variable was left undefined. ADO leaves an
undefined $(VAR) as the LITERAL STRING "$(VAR)" in subsequent bash
steps (it does NOT expand to empty). Bash sees $(AW_AZ_MOUNTS),
interprets it as a $(...) command substitution, tries to execute
a program named AW_AZ_MOUNTS, gets exit 127, and the AWF
invocation step dies under `set -e` — the exact 1ES failure mode
the refactor set out to prevent.
Fix: always emit `##vso[task.setvariable variable=AW_AZ_MOUNTS]`,
with an empty value in the missing branch. ADO then expands
$(AW_AZ_MOUNTS) to nothing and the trailing `\` line becomes a
harmless continuation no-op.
Regression guards (both lock this in):
- azure_cli.rs::test_azure_cli_prepare_steps_defines_aw_az_mounts_in_else_branch
counts `setvariable` occurrences (must be 2) and asserts the
else block contains an empty-value setvariable line.
- compiler_tests.rs::test_default_pipeline_mounts_az_and_allows_azure_hosts
asserts the same 2× count on the compiled lock.yml.
2. Cleanup — delete WRITE_REQUIRING_SAFE_OUTPUTS:
The const was retained with #[allow(dead_code)] after the
removal of `validate_write_permissions`, but its only consumers
left were the two tests that exercised the const itself. Each
`*Result` type already carries `REQUIRES_WRITE: bool` for any
caller (compiler, audit, runtime) that needs the same info.
Deleting the const removes a dead-code annotation and one
otherwise-purposeless list to maintain when adding new tools.
Test cleanup: removed `test_write_requiring_subset_of_all_known`
(purely exercised the deleted const) and rewrote
`test_all_known_completeness` to use a HashSet-based duplicate
check on ALL_KNOWN_SAFE_OUTPUTS plus the ALWAYS_ON/NON_MCP
disjointness check (preserves the meaningful invariants).
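
As an illustration of finding 1, the fixed detection step might look roughly like the sketch below. The exact bash text and the warning message are assumptions. The variable name, the two host paths, the mount arguments, and the use of `task.setvariable` and `task.logissue` come from the commit message. The helper `count_setvariable` is a hypothetical stand-in for the regression guard's counting logic.

```rust
/// Sketch of the detection step's bash body (script wording is assumed).
fn az_detection_script() -> String {
    [
        "if [ -e /usr/bin/az ] && [ -e /opt/az ]; then",
        "  echo \"##vso[task.setvariable variable=AW_AZ_MOUNTS]--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro\"",
        "else",
        "  echo \"##vso[task.logissue type=warning]azure-cli not found on this runner\"",
        "  # Always define the variable, with an empty value, so the macro expands to nothing.",
        "  echo \"##vso[task.setvariable variable=AW_AZ_MOUNTS]\"",
        "fi",
    ]
    .join("\n")
}

/// Hypothetical guard helper: counts how many times the variable is set.
fn count_setvariable(script: &str) -> usize {
    script
        .matches("task.setvariable variable=AW_AZ_MOUNTS")
        .count()
}
```

A guard in this style fails as soon as either branch stops defining the variable, which is the condition that caused the exit 127 failure.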
Validation:
- cargo build: clean
- cargo test: 1748 unit + 119 compiler + all integration pass
- cargo clippy --all-targets --all-features -- -D warnings: clean
- tests/safe-outputs/azure-cli.lock.yml regenerated (+1 line)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat(compile): inject conditional Azure CLI advisory into the agent prompt
When the always-on AzureCli extension detects azure-cli on the host
(AW_AZ_MOUNTS non-empty), append an Azure CLI advisory section to
the agent prompt so the agent knows az is on PATH inside the
sandbox, what it's good for, and the auth model. Skip the append
when az is missing so the agent never tries to call az on a runner
that doesn't have it.
Design
======
AzureCliExtension::prepare_steps() now returns TWO YAML steps:
1. Detection (existing) — sets AW_AZ_MOUNTS to the two --mount
args or empty string.
2. NEW: "Append Azure CLI prompt" — a single-quoted heredoc that
cats an Azure CLI advisory into /tmp/awf-tools/agent-prompt.md,
gated by `condition: ne(variables['AW_AZ_MOUNTS'], '')`.
The CompilerExtension trait API is unchanged. wrap_prompt_append
is unchanged. The single call site in common.rs:2311 is unchanged.
prompt_supplement() on AzureCli stays None. The conditional
injection is entirely self-contained inside the extension's own
prepare_steps Vec.
Why not extend the trait. The existing prompt_supplement() hook
doesn't carry a step-level condition. Adding one would require a
new trait method, a new wrap_prompt_append signature, an enum-macro
arm update, and a call-site change in common.rs — disproportionate
for a 15-line advisory that only one extension wants gated.
Advisory content
================
The advisory assumes az IS available (no "may be" hedging — the
step only runs when it is) and covers:
- az devops family — autoauthed via $AZURE_DEVOPS_EXT_PAT when
permissions: read: is declared
- Azure Resource Manager — separate identity required, not
provisioned by ado-aw
- Microsoft Graph — same caveat as ARM
- Fallback — file a missing-tool safe output naming azure-cli
Heredoc terminator is SINGLE-QUOTED ('AZURE_CLI_PROMPT_EOF') so
$AZURE_DEVOPS_EXT_PAT and similar literals are appended verbatim
rather than being shell-expanded to the runner's PAT value. Locked
in by test_azure_cli_prompt_append_uses_single_quoted_heredoc.
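
A rough sketch of how such a step could be rendered is shown below. The YAML layout, indentation, and function name are assumptions for illustration. The display name, condition expression, target file, and heredoc terminator are taken from this commit message.

```rust
/// Sketch: render the conditional prompt-append step as YAML text.
/// `advisory` is the markdown body to append; its content is up to the caller.
fn azure_cli_prompt_step(advisory: &str) -> String {
    let mut yaml = String::new();
    yaml.push_str("- bash: |\n");
    // Single-quoted terminator: bash performs no expansion inside the heredoc,
    // so `$AZURE_DEVOPS_EXT_PAT` is written literally, never as the PAT value.
    yaml.push_str("    cat >> /tmp/awf-tools/agent-prompt.md << 'AZURE_CLI_PROMPT_EOF'\n");
    for line in advisory.lines() {
        yaml.push_str("    ");
        yaml.push_str(line);
        yaml.push('\n');
    }
    yaml.push_str("    AZURE_CLI_PROMPT_EOF\n");
    yaml.push_str("  displayName: \"Append Azure CLI prompt\"\n");
    // Step-level gate: skipped when detection set the variable to empty.
    yaml.push_str("  condition: ne(variables['AW_AZ_MOUNTS'], '')\n");
    yaml
}
```

Because the gate lives on the step itself, the advisory text never needs "may be available" wording.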
Tests
=====
Five new unit tests in src/compile/extensions/azure_cli.rs:
- test_azure_cli_prompt_append_step_is_conditional
- test_azure_cli_prompt_append_step_targets_agent_prompt_file
- test_azure_cli_prompt_append_step_has_advisory_anchors
- test_azure_cli_prompt_append_uses_single_quoted_heredoc
- test_azure_cli_prompt_append_displayname_matches_lint_list
Existing test_azure_cli_prepare_steps_detects_az_before_setting_var
updated for the new vec length (2 instead of 1).
Integration test test_default_pipeline_mounts_az_and_allows_azure_hosts
extended to assert the displayName, the condition expression in the
same step (proximity check), and the advisory anchor strings.
tests/bash_lint_tests.rs REQUIRED_STEP_DISPLAY_NAMES gains "Append
Azure CLI prompt" so shellcheck exercises the new heredoc.
tests/safe-outputs/azure-cli.lock.yml regenerated.
Docs
====
docs/network.md and docs/tools.md "Always-on Azure CLI" sections
gain a paragraph describing the conditional advisory injection. The
same edits also correct a small carryover inaccuracy from commit
7fe562f: the previous text said "leaves AW_AZ_MOUNTS unset" — the
graceful-degradation fix actually sets it to the empty string. Now
documented correctly with the rationale.
Validation
==========
- cargo build: clean
- cargo test: 1753 unit + 119 compiler + all integration pass
- cargo clippy --all-targets --all-features -- -D warnings: clean
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(compile): silence shellcheck SC2046 on $(AW_AZ_MOUNTS) macro
The new `$(AW_AZ_MOUNTS)` line in the AWF invocation chain is an
ADO macro substituted before bash sees it, not a bash command
substitution. shellcheck cannot distinguish the two and flagged
every compiled fixture with SC2046 ("Quote this to prevent word
splitting"), turning Build & Test red.
Word splitting of the expanded value into separate `--mount` tokens
is intentional and required (the pipeline variable expands to
`--mount /opt/az:/opt/az:ro --mount /usr/bin/az:/usr/bin/az:ro` or
to the empty string). Quoting would produce a single malformed
token. Disable SC2046 with an inline directive on the `sudo -E`
line in all four base templates (base, 1es-base, job-base,
stage-base) so the directive applies to the multi-line awf
invocation as a unit.
Regenerated tests/safe-outputs/azure-cli.lock.yml; verified
bash_lint_tests passes locally with shellcheck 0.11.0.
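
To see why quoting would be wrong here, the following sketch approximates how bash tokenizes the substituted text. It is a simplified model (default IFS, no globbing or nested quoting), and both function names are hypothetical.

```rust
/// Approximates bash word splitting of unquoted text on whitespace.
fn unquoted_tokens(expanded: &str) -> Vec<String> {
    expanded.split_whitespace().map(String::from).collect()
}

/// A double-quoted value is always passed as exactly one argument,
/// even when it is empty.
fn quoted_tokens(expanded: &str) -> Vec<String> {
    vec![expanded.to_string()]
}
```

Under this model the detected value splits into four arguments (`--mount`, a spec, `--mount`, a spec), and the empty value contributes no arguments at all. Quoting would instead pass one malformed argument in the detected case and one empty argument in the missing case.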
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
+to `/tmp/awf-tools/agent-prompt.md`. The agent reads the prompt on
+startup and learns that `az` is on PATH, what it's good for
+(`az devops` autoauthed via `$AZURE_DEVOPS_EXT_PAT`, ARM and Graph
+requiring separate auth), and the fallback path (`missing-tool`
+safe output naming `azure-cli`).
+
+The step is gated by `condition: ne(variables['AW_AZ_MOUNTS'], '')`,
+which reuses the same pipeline variable the detection step writes.
+On runners where `az` is missing, the advisory step is skipped
+entirely — the agent never sees Azure CLI guidance and never tries
+to call `az`, avoiding the "told to use `az`, fails with command
+not found" failure mode.
+
+### Operator implications
+
+- **Microsoft-hosted `ubuntu-latest`**: `az` is detected, mounted, and
+available inside the agent sandbox. Nothing to do.
+- **1ES self-hosted runners *with* azure-cli baked in**: same as above.
+- **1ES self-hosted runners *without* azure-cli**: the pipeline runs
+successfully, but agents that invoke `az` get the standard
+`command not found` inside the sandbox. The warning emitted by the
+prepare step is visible in the ADO log as a yellow-flagged issue on
+the build summary; treat it as a signal to either ignore (if no
+agent on that runner needs `az`) or to install `azure-cli` on the
+runner image.
+
+See [`docs/tools.md`](tools.md#built-in-clis) for the agent-facing
+contract (auth scope, available subcommands).
 
 ## Adding Additional Hosts
 
@@ -108,46 +183,83 @@ network:
 
 ## Permissions (ADO Access Tokens)
 
-ADO does not support fine-grained permissions — there are two access levels: blanket read and blanket write. Tokens are minted from ARM service connections; `System.AccessToken` is never used for agent or executor operations.
+ADO does not support fine-grained permissions — there are two access levels:
+blanket read and blanket write. The executor (Stage 3) always has a
+write-capable token; what changes is its *source* and *attribution*:
 
-**Exception:** The trigger filter gate step (Setup job) uses `System.AccessToken`
-for two purposes: (1) self-cancelling the build when filters don't match
-(`PATCH` to `_apis/build/builds/{id}`), and (2) fetching PR metadata for
-Tier 2 filters (labels, draft status, changed files). This runs in the
-Setup job before the agent starts, outside the AWF sandbox. The pipeline
-must have "Allow scripts to access the OAuth token" enabled for this to
-work. This is a deliberate scoped exception — the token is not passed to
   # write: my-write-arm-connection # Optional — see below
 ```
 
-### Security Model
+### When to set `permissions.write`
 
-- **`permissions.read`**: Mints a read-only ADO-scoped token given to the agent inside the AWF sandbox (Stage 1). The agent can query ADO APIs but cannot write.
-- **`permissions.write`**: Mints a write-capable ADO-scoped token used **only** by the executor in Stage 3 (`SafeOutputs` job). This token is never exposed to the agent.
-- **Both omitted**: No ADO tokens are passed anywhere. The agent has no ADO API access.
+The default (`$(System.AccessToken)`) is sufficient for the vast majority of
+agents. Set `permissions.write` only when you need:
 
-### Compile-Time Validation
+1. **Cross-org or cross-project writes** — `System.AccessToken` is scoped to
+the host project. Targeting work items or repos in a different ADO
+project / organization requires an ARM SC with broader scope.
+2. **Named-identity attribution** — `System.AccessToken` writes are
+attributed to the `Project Collection Build Service` identity. An ARM SC
+attributes writes to its underlying federated identity (e.g.
+`safe-output-bot@contoso.com`), useful when audit logs or work-item
+notifications need a specific actor.
 
-If write-requiring safe-outputs (`create-pull-request`, `create-work-item`) are configured but `permissions.write` is missing, compilation fails with a clear error message.
+### Security Model
+
+- **`permissions.read`**: Mints a read-only ADO-scoped token given to the
+agent inside the AWF sandbox (Stage 1). The agent can query ADO APIs but
+cannot write.
+- **`permissions.write` (optional)**: Mints a write-capable ADO-scoped token
+used **only** by the executor in Stage 3 (`SafeOutputs` job). Overrides
+the default `$(System.AccessToken)` for write operations. Never exposed
+to the agent.
+- **Both omitted**: The agent has no ADO API access. The executor still has
+a write-capable token via `$(System.AccessToken)`, scoped by the
+pipeline's job-authorization settings.
 
 ### Examples
 
 ```yaml
-# Agent can read ADO, safe-outputs can write
+# Default: agent can read ADO, executor writes via $(System.AccessToken).
 permissions:
   read: my-read-sc
-  write: my-write-sc
 
-# Agent can read ADO, no write safe-outputs needed
+# Cross-org / named-identity attribution — executor writes via ARM SC.
 permissions:
   read: my-read-sc
-
-# Agent has no ADO access, but safe-outputs can create PRs/work items
-permissions:
   write: my-write-sc
+
+# Agent has no ADO read access; executor still writes via $(System.AccessToken).
+# (Empty front matter — no `permissions:` key at all.)
docs/safe-outputs.md: 13 additions & 1 deletion
@@ -37,6 +37,18 @@ safe-outputs:
 
 Safe output configurations are passed to Stage 3 execution and used when processing safe outputs.
 
+### Executor authentication
+
+All write-bearing safe outputs (e.g. `create-pull-request`,
+`create-work-item`, `add-pr-comment`, `upload-build-attachment`) run in the
+Stage 3 `SafeOutputs` job and authenticate to Azure DevOps using
+`SYSTEM_ACCESSTOKEN`. By default this is `$(System.AccessToken)` — the
+pipeline's built-in OAuth token running as the *Project Collection Build
+Service* identity. Set `permissions.write` to override this with an
+ARM-minted token, e.g. for cross-org writes or named-identity attribution.
+See [`docs/network.md`](network.md) and
+[`docs/template-markers.md`](template-markers.md) for details.
+
 ## Available Safe Output Tools
 
 ### comment-on-work-item
@@ -604,7 +616,7 @@ multiple uploads.
 **Notes:**
 - Single-file only; directory uploads are not supported.
 - When `build_id` is omitted and `allowed-build-ids` is configured, the allow-list check is skipped — the current build is implicitly trusted.
-- Requires `BUILD_CONTAINERID`, `BUILD_BUILDID`, and `SYSTEM_TEAMPROJECTID` (all set automatically inside an Azure DevOps pipeline job) and `vso.build_execute` scope on the executor's token (the existing write service connection provides this).
+- Requires `BUILD_CONTAINERID`, `BUILD_BUILDID`, and `SYSTEM_TEAMPROJECTID` (all set automatically inside an Azure DevOps pipeline job) and `vso.build_execute` scope on the executor's token (granted to `$(System.AccessToken)` by default, and to the ARM-minted token when `permissions.write` is set).
 
 ### cache-memory (moved to `tools:`)
 Memory is now configured as a first-class tool under `tools: cache-memory:` instead of `safe-outputs: memory:`. See the [Cache Memory section](./tools.md#cache-memory-cache-memory) in `docs/tools.md` for details.
docs/template-markers.md: 7 additions & 6 deletions
@@ -532,23 +532,24 @@ If `permissions.read` is not configured, this marker is replaced with an empty s
 
 ## {{ acquire_write_token }}
 
-Generates an `AzureCLI@2` step that acquires a write-capable ADO-scoped access token from the ARM service connection specified in `permissions.write`. This token is used only by the executor in Stage 3 (`SafeOutputs` job) and is never exposed to the agent.
+Generates an `AzureCLI@2` step that acquires a write-capable ADO-scoped access token from the ARM service connection specified in `permissions.write`. When present, this token is used by the executor in Stage 3 (`SafeOutputs` job) instead of the default `$(System.AccessToken)`, and is never exposed to the agent.
 
 The step:
 - Uses the ARM service connection from `permissions.write`
 - Calls `az account get-access-token` with the ADO resource ID
 - Stores the token in a secret pipeline variable `SC_WRITE_TOKEN`
 
-If `permissions.write` is not configured, this marker is replaced with an empty string.
+If `permissions.write` is not configured (the default), this marker is replaced with an empty string and the executor uses `$(System.AccessToken)` instead — see `{{ executor_ado_env }}` below.
 
 ## {{ executor_ado_env }}
 
-Generates the complete `env:` block (including the `env:` key) for the Stage 3 executor step. The block contains zero, one, or two lines depending on which features are configured:
+Generates the complete `env:` block (including the `env:` key) for the Stage 3 executor step. The block always contains at least `SYSTEM_ACCESSTOKEN` and is **never empty** — the executor always needs a write-capable ADO token to perform safe-output operations.
 
-* `SYSTEM_ACCESSTOKEN: $(SC_WRITE_TOKEN)` — emitted when `permissions.write` is configured. Provides the write-capable ADO token to the executor.
-* `ADO_AW_DEBUG_GITHUB_TOKEN: $(ADO_AW_DEBUG_GITHUB_TOKEN)` — emitted when `ado-aw-debug.create-issue` is configured. Provides the GitHub PAT used by the debug-only `create-issue` safe output. See [`docs/ado-aw-debug.md`](ado-aw-debug.md).
+* `SYSTEM_ACCESSTOKEN: $(SC_WRITE_TOKEN)` — emitted when `permissions.write` is configured. Sources the executor's token from the ARM-minted write token. Use this for cross-org writes or when you need named-identity attribution.
+* `SYSTEM_ACCESSTOKEN: $(System.AccessToken)` — emitted by default (no `permissions.write` set). Sources the executor's token from the pipeline's built-in OAuth token, scoped by the pipeline's "Limit job authorization scope" settings. This is the *Project Collection Build Service* identity. Sufficient for the vast majority of agents.
+* `ADO_AW_DEBUG_GITHUB_TOKEN: $(ADO_AW_DEBUG_GITHUB_TOKEN)` — additionally emitted when `ado-aw-debug.create-issue` is configured. Provides the GitHub PAT used by the debug-only `create-issue` safe output. See [`docs/ado-aw-debug.md`](ado-aw-debug.md).
 
-If neither feature is configured, this marker is replaced with an empty string so that no `env:` block is emitted at all. Note: `System.AccessToken` is never used directly — all ADO tokens come from explicitly configured service connections, and the GitHub PAT is sourced from a dedicated pipeline variable separate from the read-only `GITHUB_TOKEN` the agent sees in Stage 1.
+The agent (Stage 1) never maps `SYSTEM_ACCESSTOKEN` — that is the cross-stage trust boundary that allows the executor to safely receive a write-capable token while the agent stays read-only. (The Setup-job trigger filter gate also maps `SYSTEM_ACCESSTOKEN` for self-cancellation and PR metadata fetching, but that runs before the agent.)
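
The new `{{ executor_ado_env }}` behaviour documented in this diff can be summarised with a small sketch. The function and parameter names are hypothetical and the YAML indentation is assumed. The variable names and the selection rule come from the updated docs text.

```rust
/// Sketch of the marker expansion: the block is never empty.
/// `has_write_sc` models "permissions.write is configured";
/// `debug_create_issue` models "ado-aw-debug.create-issue is configured".
fn executor_ado_env(has_write_sc: bool, debug_create_issue: bool) -> String {
    // Pick the token source: the ARM-minted token overrides the default.
    let token = if has_write_sc {
        "$(SC_WRITE_TOKEN)"
    } else {
        "$(System.AccessToken)"
    };

    let mut block = format!("env:\n  SYSTEM_ACCESSTOKEN: {}\n", token);
    if debug_create_issue {
        block.push_str("  ADO_AW_DEBUG_GITHUB_TOKEN: $(ADO_AW_DEBUG_GITHUB_TOKEN)\n");
    }
    block
}
```

With both flags false, this still yields an `env:` block mapping `SYSTEM_ACCESSTOKEN` to `$(System.AccessToken)`, which is the change from the previous "empty string" behaviour.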