feat(daily-fact): add reasoning_depth A/B experiment + fix compiler expression transform for inline mode #31330
Merged
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
… mode (#31324) Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…= test cases Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add A/B test for reasoning depth in daily-fact campaign" to "feat(daily-fact): add reasoning_depth A/B experiment + fix compiler expression transform for inline mode" on May 10, 2026
Pull request overview
Adds an A/B experiment (reasoning_depth) to the daily-fact workflow and fixes inline-mode experiment expression rewriting so variant conditionals evaluate correctly under inlined-imports: true.
Changes:
- Introduce `experiments.reasoning_depth` (50/50 `single_pass` vs `multi_candidate`) and gate a new “multi-candidate deliberation” guideline via a variant-conditional block in `daily-fact.md`.
- Update the compiler’s `transformExperimentsExpression` to also rewrite `experiments.<name> {==,===,!=,!==} "value"` comparisons to `steps.pick-experiment.outputs.<name> ...`.
- Regenerate `daily-fact.lock.yml` and add tests covering the comparison-operator rewrite cases.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/expression_extraction.go | Extend experiment expression rewriting to handle string comparisons in inline mode. |
| pkg/workflow/expression_extraction_test.go | Add unit tests for ==, ===, !=, !== experiment comparison transforms. |
| .github/workflows/daily-fact.md | Wire reasoning_depth experiment and add variant-conditional prompt guidance. |
| .github/workflows/daily-fact.lock.yml | Regenerated compiled workflow reflecting experiment wiring and fixed expression transform. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 1
Comment on lines 219 to +224
// transformExperimentsExpression detects expressions of the form "experiments.<name>"
// and rewrites them to "steps.pick-experiment.outputs.<name>" so that the placeholder
// substitution step reads the value from the pick_experiment step output.
// This is used for ${{ experiments.name }} expressions that appear directly in the prompt body
// (mostly relevant in inline mode; in runtime-import mode the template conditional
// {{#if experiments.name}} path is handled separately via ExperimentExpressionMappings).
// (and the comparison form "experiments.<name> == "value"") and rewrites them so that the
// placeholder substitution step reads the value from the pick_experiment step output.
//
// Simple form: experiments.name → steps.pick-experiment.outputs.name
// Comparison form: experiments.name == "v" → steps.pick-experiment.outputs.name == "v"
Collaborator
@copilot review all comments
Contributor
Author
Fixed in the latest commit. Switched the nested double-quoted example on line 220 to use a backtick so the rendered comment is unambiguous:
Implements the `reasoning_depth` experiment on `daily-fact` to test whether a multi-candidate deliberation step produces more novel/engaging verses than the current single-pass approach. Also fixes a latent compiler bug exposed by this workflow's `inlined-imports: true` setting.

Changes

`daily-fact.md`: experiment wiring
- `experiments.reasoning_depth` frontmatter block (`single_pass` vs `multi_candidate`, 50/50, starts 2026-05-11, issue "[ab-advisor] Experiment campaign for daily-fact: A/B test reasoning_depth" #31324)
- `Favor recent updates` guideline bullet with a variant-conditional block:

`daily-fact.lock.yml`: regenerated via `gh aw compile daily-fact`

Compiler bug fix: `expression_extraction.go`

`transformExperimentsExpression` only handled bare `${{ experiments.NAME }}`. In workflows with `inlined-imports: true` the compiler pre-processes all `${{ … }}` expressions before runtime; the comparison form `${{ experiments.NAME == "value" }}` was left untransformed, producing an invalid GitHub Actions expression in which `experiments` is not a real context. That expression always evaluated to `false`, silently killing the treatment arm.

Fix: added `experimentComparisonRegex` to rewrite all four comparison operators (`==`, `===`, `!=`, `!==`):

The generated env var now produces a valid GA expression:

Four new test cases cover `==`, `===`, `!=`, `!==` in `expression_extraction_test.go`. This bug would have affected any workflow combining `inlined-imports: true` with value-comparison experiment conditionals.