Fix Promise.all ctx.task pending effects #930
Live-stack QA
Result: not passed within the 20-minute QA wait window.
Run: https://github.com/a5c-ai/babysitter/actions/runs/27028811430
Matrix tested: [
{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
{"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]
Verdict: pending/timeout, not a passing QA result. The workflow was still running when the predefined QA process timed out waiting for results.
Live-stack QA
Result: incomplete. The workflow was dispatched but did not reach a terminal conclusion within the 20-minute QA polling window.
Run: https://github.com/a5c-ai/babysitter/actions/runs/27028826874
Tested matrix: [{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"}]
Overall verdict: not passed because the live-stack run timed out before the selected BP/plugin scenarios completed.
Live-stack QA
Result: pending / timed out in QA watcher. The live-stack workflow was dispatched for adversarial QA, but the Babysitter QA watcher timed out after 20 minutes before the matrix jobs completed.
Run: https://github.com/a5c-ai/babysitter/actions/runs/27028827889
Current job state at watcher timeout
Tested matrix: [{"agent":"codex","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},{"agent":"codex","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"}]
Overall verdict: not yet pass/fail. Build setup passed, but the selected live-stack matrix was still pending when the QA watcher timed out.
Adversarial review result: blocking concerns found. GitHub would not allow this actor to submit a formal request-changes review on its own PR, so posting the review as a comment.
Findings
Blocker: fixed 250ms collection window can still drop Promise.all ctx.task effects
The new test only covers 100ms and 200ms delayed task builders, so it passes while the same bug remains for slower builders/hooks.
Suggested fix: replace the wall-clock grace period with deterministic accounting for in-flight
Major: required CI is failing
The PR's
QA
I dispatched
Risk Assessment
Risk level:
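To illustrate the suggested fix, here is a minimal sketch of deterministic in-flight accounting: each task builder is counted when it starts and uncounted when it settles, so the collector waits for the count to reach zero instead of waiting a fixed number of milliseconds. `InFlightTracker`, `track`, `whenIdle`, and `pendingCount` are hypothetical names chosen for this example; they are not taken from the Babysitter SDK, and the real integration point may differ.

```typescript
// Hypothetical helper for illustration only; not the Babysitter SDK API.
class InFlightTracker {
  private pending = 0;
  private waiters: Array<() => void> = [];

  /** Number of tracked operations that have started but not yet settled. */
  pendingCount(): number {
    return this.pending;
  }

  /** Run `work` and count it as in flight until it resolves or rejects. */
  track<T>(work: () => Promise<T>): Promise<T> {
    // Incremented synchronously, so every sibling passed to Promise.all
    // is counted before any of them can settle.
    this.pending += 1;
    return Promise.resolve()
      .then(work)
      .then(
        (value) => {
          this.settle();
          return value;
        },
        (error) => {
          this.settle();
          throw error;
        },
      );
  }

  /** Resolves once no tracked work is in flight (immediately if already idle). */
  whenIdle(): Promise<void> {
    if (this.pending === 0) return Promise.resolve();
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  private settle(): void {
    this.pending -= 1;
    if (this.pending === 0) {
      const waiters = this.waiters;
      this.waiters = [];
      for (const resolve of waiters) resolve();
    }
  }
}

// Demo: the second sibling takes longer than a 250 ms window would allow,
// yet its effect is still collected because we wait on the count, not the clock.
async function collectEffects(): Promise<string[]> {
  const tracker = new InFlightTracker();
  const effects: string[] = [];
  const emitAfter = (id: string, delayMs: number) =>
    tracker.track(async () => {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      effects.push(id);
    });

  const all = Promise.all([emitAfter("fast", 5), emitAfter("slow", 300)]);
  await tracker.whenIdle();
  await all;
  return effects;
}
```

The trade-off in this sketch is that work started outside `track` is invisible to the collector, so a real implementation would need either to route every task builder through the tracker or to reject untracked usage explicitly.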
Blocking review result
I attempted to submit this as a request-changes review, but GitHub rejected it because the authenticated actor owns the PR. Recording the blocking review here instead.
Blocker
Please replace the timer-dependent behavior with a deterministic contract. Either explicitly reject/document raw
QA
I dispatched
Risk Assessment
Risk level:
Live-stack QA
Run: https://github.com/a5c-ai/babysitter/actions/runs/27028826803
Result: incomplete / timed out. The workflow was still running after the 20-minute polling window in the QA process. Build and matrix setup passed; all selected live-stack scenario jobs were still in progress when polling stopped.
Matrix tested: [
{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
{"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"}
]
Overall verdict: not passed yet because the selected scenario jobs had not completed within the process timeout.
Implemented the PR feedback for #930. Summary:
Verification:
Implemented PR #930 feedback. Summary:
Verification:
Commit: 5bb6fe5
Addressed the blocking review feedback on PR #930. Summary:
Verification:
Implemented the PR #930 review feedback. Summary:
Verification:
Branch state: origin/agent/issue-879 is at 5bb6fe5 (Fix PR 930 pending task feedback).
Live-stack QA
Run: https://github.com/a5c-ai/babysitter/actions/runs/27033435831
Result: incomplete / timed out. The workflow was dispatched for adversarial QA against
Matrix tested: [{"agent":"codex","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"}]
Overall verdict: not passed because the selected live-stack scenario jobs did not complete inside the QA watcher window.
Live-stack QA
Run: https://github.com/a5c-ai/babysitter/actions/runs/27033445359
Result: incomplete / timed out in QA watcher. The workflow was still
Tested matrix: [
{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
{"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]
Matrix rationale: PR #930 changes Babysitter SDK task/orchestration internals and the tasks adapter CLI. Live-stack cannot directly target the tasks adapter CLI, so this focused matrix exercises BP plugin paths that depend on coherent task emission and orchestration across predefined process execution, process creation, native interactive mode, bridged hooks mode, and two agent/provider combinations.
Overall verdict: not passed yet because the selected live-stack workflow did not reach a terminal conclusion within the QA process timeout.
Live-stack QA
Run: https://github.com/a5c-ai/babysitter/actions/runs/27033442923
Result: incomplete / timed out. The workflow was still running after the 20-minute polling window in the predefined QA process. Build and matrix setup passed; all selected live-stack scenario jobs were still in progress when polling stopped.
Matrix tested: [
{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
{"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]
Overall verdict: not passed yet because the selected scenario jobs had not completed within the QA process timeout.
Live-stack QA
Result: incomplete / timed out in QA watcher. The live-stack workflow was dispatched for adversarial QA, but the predefined QA process reached its 20-minute polling timeout before the workflow completed.
Run: https://github.com/a5c-ai/babysitter/actions/runs/27033459858
Matrix tested: [
{"agent":"codex","model":"google-gemini31","mode":"interactive","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"codex","model":"google-gemini31","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"predefined"},
{"agent":"claude","model":"foundry-gpt55","mode":"interactive","install":"bp","live":true,"process_mode":"create"},
{"agent":"claude","model":"foundry-gpt55","mode":"bridged-hooks","install":"bp","live":true,"process_mode":"create"}
]
Overall verdict: not passed within the QA wait window. The workflow was still running when the watcher timed out, so this is not a green QA result.
Blocking this PR for correctness and verification reasons.
Findings
Blocker: fixed 250ms collection window can still drop Promise.all ctx.task effects
A sibling
Suggested fix: replace the wall-clock grace period with deterministic in-flight
Major: regression test fits inside the implementation timer
The new regression delays sibling task builders by 100ms and 200ms, while the implementation waits 250ms. That proves the fixture fits under the timer; it does not prove raw
Please add a regression where a sibling task emits after more than the current collection window and assert that
Major: required quality gates are not green
The PR should not merge until the required check is green and the targeted SDK tests/build pass in an installed workspace.
QA
I dispatched
Risk Assessment
Risk level:
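The timing argument in the regression-test finding can be made concrete with a tiny model. This is only an arithmetic illustration, not SDK code: `collectedWithinWindow` is a hypothetical function, the emit times stand in for how long each sibling task builder takes, and the 300 ms case is an assumed value chosen only because it exceeds the 250 ms window cited in the review.

```typescript
// Illustrative model only: which sibling effects a fixed collection window
// would observe, given how long each sibling takes to emit.
function collectedWithinWindow(emitTimesMs: number[], windowMs: number): number[] {
  return emitTimesMs.filter((t) => t <= windowMs);
}

// Collection window cited in the review.
const WINDOW_MS = 250;

// Current fixture: both delays fit inside the window, so the test passes
// without exercising the failure mode.
const fixture = collectedWithinWindow([100, 200], WINDOW_MS);

// Requested regression (300 ms is an assumed value): one sibling emits
// after the window has closed.
const lateSiblings = [100, 300];
const regression = collectedWithinWindow(lateSiblings, WINDOW_MS);
const dropped = lateSiblings.filter((t) => regression.indexOf(t) === -1);

console.log({ fixture, regression, dropped });
```

In this model the fixture keeps both effects while the late sibling is dropped, which is the behavior a regression with a delay above the window would need to rule out.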
Adversarial review result: request changes.
Findings
Blocker: fixed 250 ms collection window can still drop raw
Fixes #879.
Summary:
Tests: