feat(daily-fact): add reasoning_depth A/B experiment + fix compiler expression transform for inline mode#31330

Merged: pelikhan merged 5 commits into main from copilot/ab-advisor-experiment-daily-fact on May 10, 2026


Copilot AI commented May 10, 2026

Implements the reasoning_depth experiment on daily-fact to test whether a multi-candidate deliberation step produces more novel/engaging verses than the current single-pass approach. Also fixes a latent compiler bug exposed by this workflow's inlined-imports: true setting.

Changes

daily-fact.md — experiment wiring

{{#if experiments.reasoning_depth == "multi_candidate"}}
- **Multi-candidate deliberation**: Before writing, identify exactly **3 distinct candidate facts** ...
  score each 1–5 on novelty and poetic potential, write verse for the highest-scoring one only.
{{else}}
- **Favor recent updates** but include variety - pick something interesting, not just the most recent
{{/if}}
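To make the effect of the variant-conditional block concrete, here is a minimal Go sketch of how such a block could resolve to one branch. It is not the gh-aw renderer: `ifElseBlock` and `renderVariant` are hypothetical names, the bullet text is a placeholder, and the pattern only handles a single non-nested block with an `{{else}}` branch.

```go
package main

import (
	"fmt"
	"regexp"
)

// ifElseBlock matches one non-nested {{#if COND}} ... {{else}} ... {{/if}} block.
// Hypothetical helper for illustration; not taken from the gh-aw source.
var ifElseBlock = regexp.MustCompile(`(?s)\{\{#if [^}]*\}\}\n?(.*?)\{\{else\}\}\n?(.*?)\{\{/if\}\}\n?`)

// renderVariant keeps the first branch when cond is true and the second otherwise.
// cond stands in for the already-evaluated experiment comparison.
func renderVariant(prompt string, cond bool) string {
	branch := "${2}"
	if cond {
		branch = "${1}"
	}
	return ifElseBlock.ReplaceAllString(prompt, branch)
}

// prompt uses placeholder bullet text, not the real daily-fact guidelines.
const prompt = `{{#if experiments.reasoning_depth == "multi_candidate"}}
- treatment guideline
{{else}}
- control guideline
{{/if}}
`

func main() {
	fmt.Print(renderVariant(prompt, true))  // treatment arm
	fmt.Print(renderVariant(prompt, false)) // control arm
}
```

Under these assumptions, a condition that can never be true always yields the `{{else}}` branch, which is how a broken comparison would leave every run on the control guideline.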

daily-fact.lock.yml — regenerated via gh aw compile daily-fact

Compiler bug fix: expression_extraction.go

transformExperimentsExpression only handled the bare form ${{ experiments.NAME }}. In workflows with inlined-imports: true, the compiler pre-processes all ${{ … }} expressions before runtime, so the comparison form ${{ experiments.NAME == "value" }} was left untransformed. The result was an invalid GitHub Actions expression, because experiments is not a real context there; it always evaluated to false and silently killed the treatment arm.

Fix: added experimentComparisonRegex to rewrite all four comparison operators (==, ===, !=, !==):

experiments.NAME == "value"  →  steps.pick-experiment.outputs.NAME == "value"
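The rewrite above can be sketched in Go roughly as follows. The PR text does not show the actual pattern, so `experimentComparison` and `rewriteExperimentComparison` are hypothetical stand-ins for `experimentComparisonRegex` and its call site. The accepted identifier characters and the support for single-quoted literals are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// experimentComparison is a hypothetical stand-in for the PR's
// experimentComparisonRegex; the real pattern is not shown in the PR text.
// Longer operators are listed first so they are tried before their
// two-character prefixes.
var experimentComparison = regexp.MustCompile(
	`\bexperiments\.([A-Za-z_][A-Za-z0-9_]*)\s*(===|!==|==|!=)\s*("[^"]*"|'[^']*')`)

// rewriteExperimentComparison maps experiments.NAME OP "value" onto the
// pick-experiment step output, preserving the operator and the literal.
// Whitespace around the operator is normalized to single spaces.
func rewriteExperimentComparison(expr string) string {
	return experimentComparison.ReplaceAllString(expr, "steps.pick-experiment.outputs.${1} ${2} ${3}")
}

func main() {
	fmt.Println(rewriteExperimentComparison(`experiments.reasoning_depth == "multi_candidate"`))
	// prints: steps.pick-experiment.outputs.reasoning_depth == "multi_candidate"
}
```

In this sketch, input that contains no experiment comparison passes through unchanged, so the bare `experiments.NAME` form would still need its own handling.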

The generated env var now produces a valid GitHub Actions expression:

# Before (always false — experiments context doesn't exist in GA)
GH_AW_EXPR_70B2292B: ${{ experiments.reasoning_depth == "multi_candidate" }}

# After (evaluates correctly from pick-experiment step output)
GH_AW_EXPR_DD64DF46: ${{ steps.pick-experiment.outputs.reasoning_depth == "multi_candidate" }}

Four new test cases cover ==, ===, !=, !== in expression_extraction_test.go. This bug would have affected any workflow combining inlined-imports: true with value-comparison experiment conditionals.

Copilot AI and others added 3 commits May 10, 2026 12:12
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
… mode (#31324)

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…= test cases

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add A/B test for reasoning depth in daily-fact campaign" to "feat(daily-fact): add reasoning_depth A/B experiment + fix compiler expression transform for inline mode" on May 10, 2026
Copilot AI requested a review from pelikhan May 10, 2026 12:29
@pelikhan pelikhan marked this pull request as ready for review May 10, 2026 12:33
Copilot AI review requested due to automatic review settings May 10, 2026 12:33

Copilot AI left a comment

Pull request overview

Adds an A/B experiment (reasoning_depth) to the daily-fact workflow and fixes inline-mode experiment expression rewriting so variant conditionals evaluate correctly under inlined-imports: true.

Changes:

  • Introduce experiments.reasoning_depth (50/50 single_pass vs multi_candidate) and gate a new “multi-candidate deliberation” guideline via a variant-conditional block in daily-fact.md.
  • Update the compiler’s transformExperimentsExpression to also rewrite experiments.<name> {==,===,!=,!==} "value" comparisons to steps.pick-experiment.outputs.<name> ....
  • Regenerate daily-fact.lock.yml and add tests covering the comparison-operator rewrite cases.
Summary per file

| File | Description |
| --- | --- |
| pkg/workflow/expression_extraction.go | Extend experiment expression rewriting to handle string comparisons in inline mode. |
| pkg/workflow/expression_extraction_test.go | Add unit tests for ==, ===, !=, !== experiment comparison transforms. |
| .github/workflows/daily-fact.md | Wire reasoning_depth experiment and add variant-conditional prompt guidance. |
| .github/workflows/daily-fact.lock.yml | Regenerated compiled workflow reflecting experiment wiring and fixed expression transform. |

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 1

Comment on lines 219 to +224
// transformExperimentsExpression detects expressions of the form "experiments.<name>"
// and rewrites them to "steps.pick-experiment.outputs.<name>" so that the placeholder
// substitution step reads the value from the pick_experiment step output.
// This is used for ${{ experiments.name }} expressions that appear directly in the prompt body
// (mostly relevant in inline mode; in runtime-import mode the template conditional
// {{#if experiments.name}} path is handled separately via ExperimentExpressionMappings).
// (and the comparison form "experiments.<name> == "value"") and rewrites them so that the
// placeholder substitution step reads the value from the pick_experiment step output.
//
// Simple form: experiments.name → steps.pick-experiment.outputs.name
// Comparison form: experiments.name == "v" → steps.pick-experiment.outputs.name == "v"
@pelikhan
@copilot review all comments

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI commented May 10, 2026

@copilot review all comments

Fixed in the latest commit — switched the nested double-quoted example on line 220 to use a backtick so the rendered comment is unambiguous: `experiments.<name> == "value"`.

@pelikhan pelikhan merged commit e8528ba into main May 10, 2026
@pelikhan pelikhan deleted the copilot/ab-advisor-experiment-daily-fact branch May 10, 2026 15:56
Development

Successfully merging this pull request may close these issues.

[ab-advisor] Experiment campaign for daily-fact: A/B test reasoning_depth

3 participants