Summary
After upgrading from gh-aw v0.68.3 to v0.71.1 (and then v0.71.3), the number of Copilot premium requests billed per workflow run increased by 10-100x. A run that previously consumed 1-2 premium requests now consumes 50-100+, because GitHub appears to be billing every API call (both user-initiated and agent response turns) as a premium request, instead of only billing user-initiated turns.
Root cause
The v0.71 compiler injects a new environment variable into the lock file that was not present in v0.68:
COPILOT_API_KEY: dummy-byok-key-for-offline-mode
This activates the BYOK detection path in the Copilot CLI. In v0.68, this variable did not exist — the CLI authenticated normally through the api-proxy sidecar without triggering BYOK mode.
v0.68.3 lock file (working correctly)
copilot_driver.cjs /usr/local/bin/copilot --autopilot ...
# env:
COPILOT_MODEL: claude-sonnet-4
# No COPILOT_API_KEY
v0.71.3 lock file (over-billing)
copilot_harness.cjs /usr/local/bin/copilot --autopilot ...
# env:
COPILOT_API_KEY: dummy-byok-key-for-offline-mode
COPILOT_MODEL: claude-sonnet-4.6
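One quick way to tell which lock files are affected is to scan their text for the injected variable. The following is a minimal sketch only: `find_injected_key` is a hypothetical helper, and the two sample strings simply mirror the env excerpts above rather than a full lock file.

```python
import re
from typing import Optional

# Matches an env entry such as "COPILOT_API_KEY: some-value" at the start of a line.
# A commented line like "# No COPILOT_API_KEY" does not match.
KEY_PATTERN = re.compile(r"^\s*COPILOT_API_KEY:\s*(\S+)", re.MULTILINE)


def find_injected_key(lock_text: str) -> Optional[str]:
    """Return the COPILOT_API_KEY value found in the lock file text, or None."""
    match = KEY_PATTERN.search(lock_text)
    return match.group(1) if match else None


# Sample text copied from the excerpts above (not complete lock files).
V068_ENV = "# env:\nCOPILOT_MODEL: claude-sonnet-4\n# No COPILOT_API_KEY\n"
V071_ENV = (
    "# env:\n"
    "COPILOT_API_KEY: dummy-byok-key-for-offline-mode\n"
    "COPILOT_MODEL: claude-sonnet-4.6\n"
)

print(find_injected_key(V068_ENV))
print(find_injected_key(V071_ENV))
```

Pointing the same function at the text of a real compiled lock file would show whether the dummy key is present in that workflow.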
Billing data
We correlated GitHub billing CSVs (premiumRequestUsageReport) with per-run artifact data, counting X-Initiator: user headers in the API logs:
| Date | Billed premium requests | Mined total API calls | Mined user-initiated only | Billed / Total ratio |
| --- | --- | --- | --- | --- |
| Apr 29 (v0.68.3) | 22 | 327 | ~23 | 0.07 |
| May 3 (v0.71.3) | 325 | 318 | ~16 | 1.02 |
| May 4 (v0.71.3) | 1,610 | 1,705 | ~77 | 0.94 |
In v0.68, only ~7% of API calls were billed (matching user-initiated turns). In v0.71, ~94-102% of ALL API calls are billed — both user-initiated turns AND agent response turns.
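The comparison can be reproduced from the three rows above. This sketch recomputes the billed/total ratio and checks whether the billed count sits closer to the user-initiated count or to the total call count. The `closest_metric` helper is illustrative only, and the approximate user-initiated figures are taken as plain integers.

```python
# Rows copied from the table above: (date, billed, total_calls, user_initiated).
ROWS = [
    ("Apr 29 (v0.68.3)", 22, 327, 23),
    ("May 3 (v0.71.3)", 325, 318, 16),
    ("May 4 (v0.71.3)", 1610, 1705, 77),
]


def closest_metric(billed: int, total_calls: int, user_initiated: int) -> str:
    """Report whether the billed count is nearer to total calls or to user-initiated calls."""
    if abs(billed - total_calls) < abs(billed - user_initiated):
        return "total"
    return "user-initiated"


for date, billed, total, user in ROWS:
    ratio = billed / total
    tracks = closest_metric(billed, total, user)
    print(f"{date}: billed/total = {ratio:.2f}, billing tracks {tracks} calls")
```

On this data, the v0.68.3 row lands next to the user-initiated count, while both v0.71.3 rows land next to the total call count.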
Expected behavior
The cost management docs state: "A typical workflow run uses 1–2 premium requests." This was accurate in v0.68 but is wildly off in v0.71. The copilot engine should bill premium requests the same way regardless of whether copilot_driver.cjs or copilot_harness.cjs is used as the launcher.
Impact
For a moderately active repository running ~20-30 workflow runs per day, this regression causes:
- v0.68: 30-60 premium requests/day ($1.20-$2.40/day at $0.04/req)
- v0.71: 1,500-2,000 premium requests/day ($60-$80/day at $0.04/req)
This is a 30-50x cost increase with no change to the workflow source files.
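The arithmetic behind these figures, as a small sketch. The $0.04 price and the request ranges come from the list above; the function and variable names are illustrative.

```python
PRICE_PER_REQUEST_USD = 0.04  # per premium request, as quoted above


def daily_cost(requests_per_day: int) -> float:
    """Daily cost in USD, rounded to cents."""
    return round(requests_per_day * PRICE_PER_REQUEST_USD, 2)


V068_RANGE = (30, 60)       # premium requests/day on v0.68
V071_RANGE = (1500, 2000)   # premium requests/day on v0.71

v068_costs = [daily_cost(r) for r in V068_RANGE]
v071_costs = [daily_cost(r) for r in V071_RANGE]

# Compare low end with low end and high end with high end.
multipliers = [new / old for old, new in zip(V068_RANGE, V071_RANGE)]

print("v0.68 daily cost (USD):", v068_costs)
print("v0.71 daily cost (USD):", v071_costs)
print("increase factors:", [round(m, 1) for m in multipliers])
```

Comparing like with like (low end to low end, high end to high end) gives factors of roughly 33x to 50x, in line with the 30-50x stated above.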
Environment
- gh-aw versions tested: v0.68.3 (correct billing), v0.71.1 and v0.71.3 (over-billing)
- Copilot CLI: 1.0.21 (v0.68) → 1.0.35/1.0.36 (v0.71)
- Engine: copilot with model: claude-sonnet-4 / claude-sonnet-4.6
- AWF firewall: 0.25.20 (v0.68) → 0.25.28/0.25.29 (v0.71)
Workaround
None known within the copilot engine. The .md source files cannot control whether COPILOT_API_KEY is injected — it is added by the compiler.
Steps to reproduce
- Compile any workflow with engine.id: copilot using gh-aw v0.71.x
- Run the workflow
- Check the GitHub billing CSV (premiumRequestUsageReport) for the COPILOT_GITHUB_TOKEN owner
- Compare billed premium requests to the number of user-initiated API calls in the run artifacts
- Observe that ALL API calls (user + agent) are billed, not just user-initiated ones
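The comparison step can be sketched as a header count over the API logs in the run artifacts. The log line format below is an assumption made for illustration (the real artifact layout may differ), as is the `agent` initiator value; only the `X-Initiator: user` header is taken from this report.

```python
import re
from collections import Counter

# Captures the initiator value from a header such as "X-Initiator: user".
INITIATOR_RE = re.compile(r"X-Initiator:\s*(\w+)", re.IGNORECASE)


def count_initiators(log_lines):
    """Count API calls per X-Initiator value; lines without the header are skipped."""
    counts = Counter()
    for line in log_lines:
        match = INITIATOR_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Hypothetical log excerpt, one request per line.
SAMPLE_LOG = [
    "POST /chat/completions X-Initiator: user",
    "POST /chat/completions X-Initiator: agent",
    "POST /chat/completions X-Initiator: agent",
    "GET /models",
]

counts = count_initiators(SAMPLE_LOG)
total_calls = sum(counts.values())
print(f"user-initiated: {counts['user']}, total with initiator header: {total_calls}")
```

If the billed figure in the CSV is close to the user-initiated count, billing matches the v0.68 pattern; if it is close to the total, it matches the v0.71 pattern described here.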
Workaround test: removing the dummy key breaks MCP servers
Tested removing COPILOT_API_KEY: dummy-byok-key-for-offline-mode from lock files post-compile. Premium request billing returned to normal (2 per run), but all custom MCP servers were blocked by policy:
! 3 MCP servers were blocked by policy
The dummy key is required for the AWF BYOK runtime path that allows custom MCP servers. Without it, the Copilot CLI falls back to standard mode where enterprise/org policy blocks non-built-in MCP servers.
There is no user-side workaround — the BYOK runtime path (needed for MCP) and the billing behavior change (all calls counted as premium) are coupled.
Related: #28470 — likely the same root cause, misdiagnosed
#28470 ("See large increase in runtime cost with 0.71.0") reported an identical 20-30x cost spike after v0.71. The Q audit investigation attributed it to:
- Detection job now functional (was crashing in v0.68 due to node: command not found) → 3x more jobs completing
- 67% more turns in the agent job due to blocked network requests
The investigation concluded: "the increased cost reflects what the workflow was supposed to cost all along."
However, the audit focused on job count and turn count; it never examined how premium requests are metered per API call. Our billing CSV correlation data shows that the per-call billing ratio changed from 0.07 to ~1.0 between v0.68 and v0.71. The 3x increase from the detection fix is real but minor; the 14x change in per-call billing ratio from the BYOK dummy key is the dominant factor.
Billing ratio: v0.68 vs v0.71
| Date | Version | Mined API calls | GitHub-billed premium reqs | Ratio (billed / calls) |
| --- | --- | --- | --- | --- |
| Apr 27 | v0.68 | 475 | 31 | 0.07 |
| Apr 28 | v0.68 | 174 | 14 | 0.08 |
| Apr 29 | v0.68 | 327 | 22 | 0.07 |
| Apr 30 | v0.71 | 505 | 377 | 0.75 |
| May 2 | v0.71 | 48 | 68 | 1.42 |
| May 3 | v0.71 | 318 | 325 | 1.02 |
| May 4 | v0.71 | 1,705 | 1,610 | 0.94 |
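As a cross-check on the per-call ratio change, the daily rows can be pooled per version. This is a sketch over the table values only. Pooling (summing billed requests and calls before dividing) is one reasonable way to aggregate and is an assumption of this example, not necessarily how the figures quoted earlier were derived.

```python
from collections import defaultdict

# Rows copied from the table above: (date, version, mined_api_calls, billed_premium_requests).
ROWS = [
    ("Apr 27", "v0.68", 475, 31),
    ("Apr 28", "v0.68", 174, 14),
    ("Apr 29", "v0.68", 327, 22),
    ("Apr 30", "v0.71", 505, 377),
    ("May 2", "v0.71", 48, 68),
    ("May 3", "v0.71", 318, 325),
    ("May 4", "v0.71", 1705, 1610),
]


def pooled_ratios(rows):
    """Return billed/calls per version, summing over all days before dividing."""
    calls = defaultdict(int)
    billed = defaultdict(int)
    for _date, version, n_calls, n_billed in rows:
        calls[version] += n_calls
        billed[version] += n_billed
    return {version: billed[version] / calls[version] for version in calls}


ratios = pooled_ratios(ROWS)
fold_change = ratios["v0.71"] / ratios["v0.68"]

for version, ratio in ratios.items():
    print(f"{version}: pooled billed/calls = {ratio:.3f}")
print(f"fold change: {fold_change:.1f}x")
```

Pooled this way, the ratio moves from about 0.07 to about 0.92, a change of a little over 13x, which is consistent with the roughly 14x figure given above.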
