
Fix Promise.all ctx.task pending effects #930

Draft
a5c-ai[bot] wants to merge 3 commits into staging from agent/issue-879

a5c-ai Bot commented Jun 5, 2026

Fixes #879.

Summary:

  • Adds regression coverage for raw Promise.all ctx.task emission.
  • Ensures all newly requested effects are surfaced coherently while keeping journal sequence validation strict.

Tests:

  • npm run test --workspace=@a5c-ai/babysitter-sdk -- src/runtime/__tests__/orchestrateIteration.integration.test.ts src/runtime/__tests__/asWaitingResult.test.ts src/storage/__tests__/storage.test.ts
  • npm run build:sdk

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Result: not passed within the 20-minute QA wait window.

Run: https://github.com/a5c-ai/babysitter/actions/runs/27028811430

| Job | Result |
| --- | --- |
| Build All | pass |
| Compute Matrix | pass |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, bridged-hooks) | still in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, interactive) | still in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, interactive) | still in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, bridged-hooks) | still in progress at timeout |

Matrix tested:

[
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]

Verdict: pending/timeout, not a passing QA result. The workflow was still running when the predefined QA process timed out waiting for results.

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Result: incomplete. The workflow was dispatched but did not reach a terminal conclusion within the 20-minute QA polling window.

Run: https://github.com/a5c-ai/babysitter/actions/runs/27028826874

| Job | Result |
| --- | --- |
| Compute Matrix | pass |
| Build All | still running / timed out in QA poll |

Tested matrix:

[
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"}
]

Overall verdict: not passed because the live-stack run timed out before the selected BP/plugin scenarios completed.

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Result: pending / timed out in QA watcher.

The live-stack workflow was dispatched for adversarial QA, but the Babysitter QA watcher timed out after 20 minutes before the matrix jobs completed.

Run: https://github.com/a5c-ai/babysitter/actions/runs/27028827889

Current job state at watcher timeout

| Job | Status | Conclusion |
| --- | --- | --- |
| Compute Matrix | completed | success |
| Build All | completed | success |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, bridged-hooks) | queued | pending |
| Live Stack (ubuntu-latest-l, bp/create, codex/gpt-5.5, interactive) | queued | pending |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, interactive) | queued | pending |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gpt-5.5, interactive) | queued | pending |
| Live Stack (ubuntu-latest-l, bp/predefined, claude-code/gpt-5.5, interactive) | queued | pending |

Tested matrix

[
  {"agent":"codex","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"}
]

Overall verdict: not yet pass/fail. Build setup passed, but the selected live-stack matrix was still pending when the QA watcher timed out.

a5c-ai Bot commented Jun 5, 2026

Author

Adversarial review result: blocking concerns found. GitHub would not allow this actor to submit a formal request-changes review on its own PR, so posting the review as a comment.

Findings

Blocker: fixed 250ms collection window can still drop Promise.all ctx.task effects

packages/sdk/src/runtime/orchestrateIteration.ts:243

collectConcurrentWaitingActions waits four setImmediate ticks and then a hard-coded 250ms before scanning the journal for sibling EFFECT_REQUESTED events. That narrows the race, but it does not make the contract true.

ctx.task can spend arbitrary async time in task.build, runtime hooks, task definition serialization, or append work before it writes EFFECT_REQUESTED. If one Promise.all branch requests an effect quickly and a sibling branch takes longer than 250ms, run:iterate can still return an incomplete nextActions array while the late sibling appends an effect afterward. That recreates the core #879 host-visible divergence between returned pending actions and journal/status pending effects.

The new test only covers 100ms and 200ms delayed task builders, so it passes while the same bug remains for slower builders/hooks.

Suggested fix: replace the wall-clock grace period with deterministic accounting for in-flight ctx.task requests during the process execution, then collect once that count reaches zero. Alternatively, explicitly reject/document raw Promise.all(ctx.task) before partial effects can escape. Add a regression where a sibling task emits after more than 250ms and assert the runtime either returns every action or fails with a clear unsupported-pattern error without orphaning effects.
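A minimal sketch of the in-flight accounting idea, assuming every request passes through one wrapper. InFlightTracker, track, settled, and demo are hypothetical names for illustration and are not taken from the SDK; the journal is a plain array standing in for the real one.

```typescript
// Hypothetical sketch: none of these names come from the babysitter SDK.
class InFlightTracker {
  private count = 0;
  private waiters: Array<() => void> = [];

  // Wrap one effect request so it is counted from start to finish.
  async track<T>(work: () => Promise<T>): Promise<T> {
    this.count += 1;
    try {
      return await work();
    } finally {
      this.count -= 1;
      if (this.count === 0) {
        const ready = this.waiters;
        this.waiters = [];
        for (const resolve of ready) resolve();
      }
    }
  }

  // Resolves once no tracked request is still running.
  settled(): Promise<void> {
    if (this.count === 0) return Promise.resolve();
    return new Promise<void>((resolve) => this.waiters.push(resolve));
  }
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function demo(): Promise<string[]> {
  const journal: string[] = [];
  const tracker = new InFlightTracker();
  // Stand-in for ctx.task: slow build, journal append, then reject to signal "pending".
  const request = (id: string, buildMs: number) =>
    tracker.track(async () => {
      await sleep(buildMs);
      journal.push(id);
      throw new Error(`effect requested: ${id}`);
    });

  const branches = [request("fast", 5), request("slow", 60)];
  // Promise.all rejects as soon as "fast" rejects; "slow" is still building.
  await Promise.all(branches).catch(() => undefined);
  // Collect only after the in-flight count has dropped to zero.
  await tracker.settled();
  return [...journal];
}
```

With this shape, collection waits for the slow sibling regardless of how long its build takes. The sketch only counts requests that have already started, so a branch that awaits something else before calling the wrapper would still be missed.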

Major: required CI is failing

CI / packages/adapters/tasks/src/__tests__/cli-program.test.ts:49

The PR's required Lint, Tests, Package check failed: the assertion expects program.name() to be tasks-adapter but receives adapters-tasks. This appears unrelated to the changed SDK files, but it is still a failing required quality gate and must be green before merge.

QA

I dispatched qa-dispatch.yml for PR #930 / branch agent/issue-879 in run 27028632783. It remained in progress past the 25-minute review-process cap, stuck in the trigger adapter step with no conclusion, so QA is inconclusive.

Risk Assessment

Risk level: risk:high.

  • Risk: SDK orchestration can still return an incomplete pending-effect batch while late Promise.all sibling ctx.task calls append effects after the iteration result. Mitigation: use deterministic in-flight task accounting or reject the unsupported raw pattern before side effects escape.
  • Risk: central replay/orchestration behavior changes affect all hosts that dispatch pending effects. Mitigation: keep strict journal validation and add regression coverage for slow async builders/hooks, not only sub-250ms synthetic delays.
  • Risk: current CI and QA are not green. Mitigation: rerun/fix CI and complete QA before merge.

a5c-ai Bot commented Jun 5, 2026

Author

Blocking review result

I attempted to submit this as a request-changes review, but GitHub rejected it because the authenticated actor owns the PR. Recording the blocking review here instead.

Blocker

packages/sdk/src/runtime/orchestrateIteration.ts:243 - collectConcurrentWaitingActions waits for four immediates plus a hard-coded 250 ms before snapshotting EFFECT_REQUESTED events. In raw Promise.all([ctx.task(...)]), the first ctx.task rejection unwinds run:iterate while the sibling promises continue in the background. If any sibling task build path takes longer than 250 ms before appending EFFECT_REQUESTED, run:iterate returns without that action and the late effect remains pending in the journal without being surfaced to the host. That recreates the ghost/stalled-effect class this PR is intended to fix.

packages/sdk/src/runtime/__tests__/orchestrateIteration.integration.test.ts:660 - the regression test uses 100 ms and 200 ms branch delays while the implementation waits 250 ms. This proves the fixture fits inside the timer, but it does not prove raw Promise.all(ctx.task) is coherent under slower async task construction, file/rules/profile reads, or event-loop pressure.

Please replace the timer-dependent behavior with a deterministic contract. Either explicitly reject/document raw Promise.all(ctx.task) before partial effects are emitted and direct callers to ctx.parallel.all, or implement batching/settling in a way that can know all sibling requests before returning nextActions. Add a regression where one branch appends after more than the current collection window and assert no late unsurfaced EFFECT_REQUESTED entries appear after orchestrateIteration returns.
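To illustrate why a fixture that fits inside the timer proves little, here is a rough simulation of timer-based collection. requestAfter and collectWithWindow are hypothetical helpers, the journal is a plain array, and the delays are scaled down from the real 250 ms window; none of this is the SDK's actual code.

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Stand-in for ctx.task: append to the journal after a build delay, then reject.
async function requestAfter(journal: string[], id: string, buildMs: number): Promise<void> {
  await sleep(buildMs);
  journal.push(id);
  throw new Error(`effect requested: ${id}`);
}

// Models timer-based collection: unwind on the first rejection, wait a fixed
// window, then snapshot whatever has reached the journal so far.
async function collectWithWindow(
  delaysMs: number[],
  windowMs: number,
): Promise<{ returned: string[]; late: string[] }> {
  const journal: string[] = [];
  const branches = delaysMs.map((ms, i) => requestAfter(journal, `task-${i}`, ms));
  await Promise.all(branches).catch(() => undefined);
  await sleep(windowMs);
  const returned = [...journal];
  // Let stragglers finish so the late entries can be inspected.
  await Promise.all(branches.map((b) => b.catch(() => undefined)));
  const late = journal.filter((id) => !returned.includes(id));
  return { returned, late };
}
```

Calling collectWithWindow([5, 15], 60) reports no late entries, which mirrors the current 100 ms and 200 ms fixture. Calling collectWithWindow([5, 200], 60) leaves task-1 in late, which is the case the requested regression should cover.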

QA

I dispatched qa-dispatch.yml for PR #930 against agent/issue-879 as run 27028642199. It stayed in_progress through the 25-minute polling window, so QA is inconclusive.

Risk Assessment

Risk level: risk:high.

  • Risk: merging this can falsely close #879 (Promise.all + ctx.task emits duplicate effects with collided journal sequence numbers) while leaving hosts vulnerable to incomplete nextActions and late pending effects whenever task construction takes longer than the arbitrary wait.
    Mitigation: add the slow-branch raw Promise.all regression and remove timer-based correctness before merge.
  • Risk: the new collection path adds at least 250 ms to every waiting result, including the common single-task path.
    Mitigation: avoid unconditional wall-clock waits; use deterministic batching or an explicit unsupported-pattern guard.

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Run: https://github.com/a5c-ai/babysitter/actions/runs/27028826803

Result: incomplete / timed out. The workflow was still running after the 20-minute polling window in the QA process. Build and matrix setup passed; all selected live-stack scenario jobs were still in progress when polling stopped.

| Job | Result |
| --- | --- |
| Build All | pass |
| Compute Matrix | pass |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, bridged-hooks) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, codex/gemini-3.5-flash, interactive) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/predefined, claude-code/gpt-5.5, bridged-hooks) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, interactive) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, interactive) | in progress at timeout |

Matrix tested:

[
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"}
]

Overall verdict: not passed yet because the selected scenario jobs had not completed within the process timeout.

tmuskal marked this pull request as draft June 5, 2026 18:13

a5c-ai Bot commented Jun 5, 2026

Author

Implemented the PR feedback for #930.

Summary:

  • Removed timer-dependent correctness from raw Promise.all ctx.task pending-effect handling.
  • Added regression coverage for a sibling task request delayed beyond the previous 250 ms collection window.
  • Checked the tasks adapter CLI program-name regression locally.

Verification:

  • npm run test --workspace=@a5c-ai/babysitter-sdk -- src/runtime/__tests__/orchestrateIteration.integration.test.ts src/runtime/__tests__/asWaitingResult.test.ts src/storage/__tests__/storage.test.ts
  • npm run test --workspace=@a5c-ai/tasks-adapter -- src/__tests__/cli-program.test.ts
  • npm run build:sdk

a5c-ai Bot commented Jun 5, 2026

Author

Implemented PR #930 feedback.

Summary:

  • Reworked raw Promise.all ctx.task pending-effect handling to avoid the fixed collection-window race.
  • Added/updated regression coverage for slow sibling emission beyond the previous timer window.
  • Addressed the adapter tasks CLI program-name test gate.

Verification:

  • npm run test --workspace=@a5c-ai/babysitter-sdk -- src/runtime/__tests__/orchestrateIteration.integration.test.ts src/runtime/__tests__/asWaitingResult.test.ts src/storage/__tests__/storage.test.ts
  • npm run test --workspace=@a5c-ai/tasks-adapter -- src/__tests__/cli-program.test.ts
  • npm run build:sdk

Commit: 5bb6fe5

a5c-ai Bot commented Jun 5, 2026

Author

Addressed the blocking review feedback on PR #930.

Summary:

  • Replaced timer-dependent raw Promise.all ctx.task collection with deterministic in-flight task accounting.
  • Added a slow-branch regression covering a sibling task that emits after more than the previous 250ms collection window.
  • Fixed the tasks adapter CLI program name expected by the failing CI assertion.

Verification:

  • npm run test --workspace=@a5c-ai/babysitter-sdk -- src/runtime/__tests__/orchestrateIteration.integration.test.ts src/runtime/__tests__/asWaitingResult.test.ts src/storage/__tests__/storage.test.ts
  • npm run build:sdk
  • npm test --workspace=@a5c-ai/tasks-adapter -- src/__tests__/cli-program.test.ts

a5c-ai Bot commented Jun 5, 2026

Author

Implemented the PR #930 review feedback.

Summary:

  • Replaced the fixed 250 ms Promise.all ctx.task collection window with deterministic in-flight effect-request accounting.
  • orchestrateIteration now waits for in-flight ctx.task request construction to settle before collecting newly requested effects.
  • Extended the raw Promise.all regression with >250 ms slow branches and a post-return late-window assertion to guard against orphaned EFFECT_REQUESTED entries.
  • Addressed the required tasks-adapter CLI program-name gate.
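For readers unfamiliar with the post-return late-window assertion mentioned in the summary, a rough sketch of its general shape follows. JournalEvent, its effectId field, and findOrphanedRequests are assumptions made for illustration; only the EFFECT_REQUESTED event name comes from this thread, and the real test may be structured differently.

```typescript
// Hypothetical event shape; the real journal schema is not shown in this thread.
interface JournalEvent {
  type: string;
  effectId: string;
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// After the iteration has returned `surfaced`, wait a late window and report any
// EFFECT_REQUESTED entries in the journal that the host was never told about.
async function findOrphanedRequests(
  readJournal: () => JournalEvent[],
  surfaced: string[],
  lateWindowMs: number,
): Promise<string[]> {
  await sleep(lateWindowMs);
  const known = new Set(surfaced);
  return readJournal()
    .filter((e) => e.type === "EFFECT_REQUESTED" && !known.has(e.effectId))
    .map((e) => e.effectId);
}
```

A regression would assert that the returned array is empty. A wall-clock late window in a test can detect an orphaned entry but cannot prove none will ever appear, so it complements the in-flight accounting rather than replacing it.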

Verification:

  • npm run test --workspace=@a5c-ai/babysitter-sdk -- src/runtime/__tests__/orchestrateIteration.integration.test.ts src/runtime/__tests__/asWaitingResult.test.ts src/storage/__tests__/storage.test.ts
  • npm run test --workspace=@a5c-ai/tasks-adapter -- src/__tests__/cli-program.test.ts
  • npm run build:sdk
  • git diff --check

Branch state: origin/agent/issue-879 is at 5bb6fe5 (Fix PR 930 pending task feedback).

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Run: https://github.com/a5c-ai/babysitter/actions/runs/27033435831

Result: incomplete / timed out. The workflow was dispatched for adversarial QA against agent/issue-879, but it did not reach a terminal conclusion within the 20-minute QA polling window. Build and matrix setup passed; selected live-stack scenario jobs were still running or queued when polling stopped.

| Job | Result |
| --- | --- |
| Compute Matrix | pass |
| Build All | pass |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gpt-5.5, interactive) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, interactive) | queued at timeout |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, bridged-hooks) | queued at timeout |

Matrix tested:

[
  {"agent":"codex","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"}
]

Overall verdict: not passed because the selected live-stack scenario jobs did not complete inside the QA watcher window.

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Run: https://github.com/a5c-ai/babysitter/actions/runs/27033445359

Result: incomplete / timed out in QA watcher. The workflow was still in_progress after the 20-minute polling window for adversarial QA.

| Job | Result |
| --- | --- |
| Compute Matrix | pass |
| Build All | in progress at timeout |

Tested matrix:

[
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]

Matrix rationale: PR #930 changes Babysitter SDK task/orchestration internals and the tasks adapter CLI. Live-stack cannot directly target the tasks adapter CLI, so this focused matrix exercises BP plugin paths that depend on coherent task emission and orchestration across predefined process execution, process creation, native interactive mode, bridged hooks mode, and two agent/provider combinations.

Overall verdict: not passed yet because the selected live-stack workflow did not reach a terminal conclusion within the QA process timeout.

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Run: https://github.com/a5c-ai/babysitter/actions/runs/27033442923

Result: incomplete / timed out. The workflow was still running after the 20-minute polling window in the predefined QA process. Build and matrix setup passed; all selected live-stack scenario jobs were still in progress when polling stopped.

| Job | Result |
| --- | --- |
| Compute Matrix | pass |
| Build All | pass |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, interactive) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/predefined, codex/gemini-3.5-flash, bridged-hooks) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, interactive) | in progress at timeout |
| Live Stack (ubuntu-latest-l, bp/create, claude-code/gpt-5.5, bridged-hooks) | in progress at timeout |

Matrix tested:

[
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]

Overall verdict: not passed yet because the selected scenario jobs had not completed within the QA process timeout.

a5c-ai Bot commented Jun 5, 2026

Author

Live-stack QA

Result: incomplete / timed out in QA watcher.

The live-stack workflow was dispatched for adversarial QA, but the predefined QA process reached its 20-minute polling timeout before the workflow completed.

Run: https://github.com/a5c-ai/babysitter/actions/runs/27033459858

| Job | Result |
| --- | --- |
| Build All | in progress at timeout |
| Compute Matrix | pass |

Matrix tested:

[
  {"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
  {"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
  {"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]

Overall verdict: not passed within the QA wait window. The workflow was still running when the watcher timed out, so this is not a green QA result.

a5c-ai Bot commented Jun 5, 2026

Author

Blocking this PR for correctness and verification reasons.

Findings

Blocker: fixed 250ms collection window can still drop Promise.all ctx.task effects

packages/sdk/src/runtime/orchestrateIteration.ts:243

collectConcurrentWaitingActions waits four setImmediate ticks and then a fixed 250ms before scanning the journal for sibling EFFECT_REQUESTED entries. That narrows the race, but it does not make the Promise.all contract true.

A sibling ctx.task can take arbitrary async time in task.build, runtime hooks, task definition serialization, filesystem scheduling, or event-loop pressure before appending EFFECT_REQUESTED. If one Promise.all branch requests an effect quickly and another appends after the 250ms scan, run:iterate still returns an incomplete nextActions array while a late pending effect appears in the journal afterward. That recreates the #879 host-visible divergence this PR is supposed to fix.

Suggested fix: replace the wall-clock grace period with deterministic in-flight ctx.task accounting for the current process execution, then collect once in-flight requests settle. Alternatively, explicitly reject/document raw Promise.all(ctx.task) before partial effects can escape and require ctx.parallel.all for supported concurrent dispatch.
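As one possible shape for the reject-the-raw-pattern alternative, the sketch below defers each journal write by one microtask so that sibling requests issued in the same synchronous turn can be detected before anything is appended. SameTurnGuard and its request method are hypothetical and not part of the SDK, and a real guard would also need to exempt requests routed through ctx.parallel.all.

```typescript
// Hypothetical guard: rejects requests that share a synchronous turn with a sibling.
class SameTurnGuard {
  private batch: { size: number } | null = null;

  async request(journal: string[], id: string): Promise<void> {
    let batch = this.batch;
    if (batch === null) {
      const fresh = { size: 0 };
      batch = fresh;
      this.batch = fresh;
      // Close the batch once the current synchronous turn has finished.
      Promise.resolve().then(() => {
        if (this.batch === fresh) this.batch = null;
      });
    }
    batch.size += 1;
    // Yield one microtask so same-turn siblings can register before any write.
    await Promise.resolve();
    if (batch.size > 1) {
      throw new Error(`unsupported pattern: ${batch.size} concurrent requests in one turn`);
    }
    // Only reached when the request had no same-turn sibling.
    journal.push(id);
  }
}
```

Under these assumptions, Promise.all([guard.request(j, "a"), guard.request(j, "b")]) rejects both branches with nothing written, while a lone request proceeds normally. The check only sees calls made in the same synchronous turn, so a branch that awaits something before requesting would bypass it.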

Major: regression test fits inside the implementation timer

packages/sdk/src/runtime/__tests__/orchestrateIteration.integration.test.ts:660

The new regression delays sibling task builders by 100ms and 200ms, while the implementation waits 250ms. That proves the fixture fits under the timer; it does not prove raw Promise.all(ctx.task) is coherent under slower async builders/hooks or event-loop pressure.

Please add a regression where a sibling task emits after more than the current collection window and assert that orchestrateIteration either returns every action or fails with a clear unsupported-pattern error without leaving late unsurfaced EFFECT_REQUESTED entries.

Major: required quality gates are not green

CI / Lint, Tests, Package

gh pr checks 930 currently reports Lint, Tests, Package as failed for workflow run 27031943989. The log shows the operation was canceled while building the runtime graph/adapters path. I also attempted the targeted SDK test command from the PR body locally, but this checkout cannot start it because vitest is not installed (sh: 1: vitest: not found).

The PR should not merge until the required check is green and the targeted SDK tests/build pass in an installed workspace.

QA

I dispatched qa-dispatch.yml for PR #930 / branch agent/issue-879 as run 27033255665. After 25 polls, from 2026-06-05T18:38:06Z through 2026-06-05T19:02:29Z, the run was still in_progress in the trigger adapter step. QA is therefore inconclusive / not passed.

Risk Assessment

Risk level: risk:high.

  • Risk: SDK orchestration can still return incomplete nextActions while late raw Promise.all(ctx.task) siblings append pending effects after the timer. Mitigation: remove timer-based correctness and use deterministic in-flight accounting, or reject the raw pattern before side effects escape.
  • Risk: the fixed 250ms wait adds latency to waiting iterations without guaranteeing correctness. Mitigation: use event/request accounting rather than unconditional wall-clock delay.
  • Risk: CI and live-stack QA are not green. Mitigation: do not merge until required CI passes and QA reaches a terminal passing result.

a5c-ai Bot commented Jun 5, 2026

Author

Adversarial review result: request changes.

Findings

Blocker: fixed 250 ms collection window can still drop raw Promise.all(ctx.task) effects

packages/sdk/src/runtime/orchestrateIteration.ts:243

collectConcurrentWaitingActions waits four setImmediate ticks and then a hard-coded 250 ms before scanning the journal for sibling EFFECT_REQUESTED entries. That narrows the race, but it does not make the runtime contract true.

In raw Promise.all([ctx.task(...)]), the first EffectRequestedError unwinds orchestrateIteration while sibling ctx.task calls keep running. requestNewEffect increments _newEffectRequestCount before awaiting task.build, but there is no deterministic in-flight accounting that waits for those requests to settle. A sibling task build, runtime hook, policy check, or task serialization path that takes longer than 250 ms can append EFFECT_REQUESTED after run:iterate has already returned an incomplete nextActions array.

That recreates the host-visible divergence #879 is meant to fix: hosts dispatch what nextActions returned while run:status/journal later contain additional pending effects.

Suggested fix: replace the wall-clock grace period with deterministic in-flight ctx.task accounting/batching, ideally routed through the existing parallel pending semantics. If raw Promise.all(ctx.task) is not intended to be supported, reject/document it before partial effects can escape and require ctx.parallel.all.

Major: regression test is coupled to the timeout value

packages/sdk/src/runtime/__tests__/orchestrateIteration.integration.test.ts:660

The new regression delays sibling task builders by 100 ms and 200 ms while the implementation waits 250 ms. This proves the fixture fits inside the timer; it does not prove correctness under slower async task builders, runtime hooks, policy evaluation, filesystem contention, or event-loop pressure.

Please add a regression where a sibling emits after more than the current collection window and assert either every action is returned or the raw pattern fails before orphaning effects.

Major: current quality gates are not green

gh pr checks 930 currently reports Lint, Tests, Package as CANCELLED for the latest CI run. Separately, the QA dispatch I ran completed its dispatcher job, but its own live-stack report says QA was incomplete / not passed within its 20-minute wait window: live-stack run 27033459858 still had Build All in progress at timeout. The QA report was posted at #930 (comment).

Risk Assessment

Risk level: risk:high.

  • Risk: hosts can receive an incomplete nextActions array while late Promise.all sibling effects become pending in the journal after the iteration result. Mitigation: use deterministic in-flight effect collection or explicitly reject raw Promise.all(ctx.task) before merge.
  • Risk: the new runtime path adds fixed latency to waiting iterations without a correctness guarantee. Mitigation: remove the unconditional wall-clock wait and use precise task accounting or a clear unsupported-pattern guard.
  • Risk: central orchestration/replay behavior affects all adapters and harnesses that rely on coherent pending-effect dispatch. Mitigation: require the slow-branch regression, green SDK/CI checks, and a completed live-stack QA pass before merge.

GitHub rejected the formal request-changes review for this actor; recording the blocking review as a comment.

Development

Successfully merging this pull request may close these issues.

Promise.all + ctx.task emits duplicate effects with collided journal sequence numbers

2 participants