fix(flue): switch kimi automations to k2.7-code and handle 429 capacity gracefully by ascorbic · Pull Request #1490 · emdash-cms/emdash

ascorbic · 2026-06-15T15:57:46Z

What does this PR do?

Fixes the regression where the automated PR reviewer (emdash-flue-review) stopped posting reviews around #1484, with no deploy and no error in logs.

Root cause: Workers AI returns HTTP 429 when a model is over capacity. We confirmed sustained 429s on @cf/moonshotai/kimi-k2.6 (and, to a lesser extent, 2.7). Under load the AI binding can hold a request open indefinitely, so the review workflow's session.skill call never returned: no result, no posted review, just the Sandbox container's keep-alive alarm firing for minutes. There was no timeout or capacity handling anywhere, so a transient capacity spike turned into a permanent silent hang.
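The failure mode above (an awaited call that never settles) can be bounded with a per-attempt deadline. The following is a minimal sketch of that idea, not the code from this PR: `withTimeout` and `TimeoutError` are hypothetical names, and the sketch assumes the stalled call is exposed as an ordinary promise.

```typescript
// Hypothetical helper for illustration; names and signature are not taken from the PR.
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`model call exceeded ${ms}ms`);
    this.name = "TimeoutError";
  }
}

// Rejects with TimeoutError if `work` has not settled within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, deadline]);
  } finally {
    // Always clear the timer so a finished call leaves no pending timeout behind.
    clearTimeout(timer);
  }
}

// Usage (illustrative): a promise that never settles stands in for a held-open request.
// await withTimeout(new Promise<string>(() => {}), 30_000); // rejects with TimeoutError
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying request. Actually aborting the request would need cooperation from whatever API issues it, which this sketch does not assume.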

Changes:

Move every kimi usage to kimi-k2.7-code (less loaded right now): the review agent, the fix agent, both reply classifiers, and the investigate classifier. (The kimi alias for /bonk and /review was already moved in chore: bump flue/bonk coding agents to kimi-k2.7-code #1485.)
Add withCapacityRetry (one copy per flue deploy unit, since .flue and infra/flue-review are independent workspaces): bounds each model call with a hard per-attempt timeout so a stalled call fails loudly instead of hanging, and retries genuine 429 capacity errors with exponential backoff + full jitter. Per-attempt timeouts are intentionally not retried (can't distinguish a stall from slow-but-working progress) — they fail loud and bounded, and the workflow's at-least-once restart handles re-running.
Apply it to the flue-review review skill and to every model-bearing stage of the investigate/classify workflows.
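The retry policy described for withCapacityRetry (retry only genuine 429s, exponential backoff with full jitter, let timeouts propagate) could look roughly like the sketch below. This is an illustration under stated assumptions, not the PR's implementation: `retryOnCapacity`, `RetryOptions`, `isCapacityError`, and `fullJitterDelay` are hypothetical names, and the way a Workers AI 429 surfaces on the thrown error (a `status` field or a "429" in the message) is an assumption.

```typescript
// All names in this block are hypothetical and exist only for illustration.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // backoff base
  maxDelayMs: number; // cap on any single delay
  random?: () => number; // injectable for tests; defaults to Math.random
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Assumption: a capacity error carries status 429 or mentions 429 in its message.
function isCapacityError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; message?: unknown };
  return e.status === 429 || (typeof e.message === "string" && /\b429\b/.test(e.message));
}

// Full jitter: choose uniformly in [0, min(cap, base * 2^attempt)).
function fullJitterDelay(attempt: number, base: number, cap: number, random: () => number): number {
  return Math.floor(random() * Math.min(cap, base * 2 ** attempt));
}

async function retryOnCapacity<T>(call: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const random = opts.random ?? Math.random;
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((r) => setTimeout(r, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const lastAttempt = attempt + 1 >= opts.maxAttempts;
      // Anything that is not a 429 (a timeout, for example) propagates immediately.
      if (!isCapacityError(err) || lastAttempt) throw err;
      await sleep(fullJitterDelay(attempt, opts.baseDelayMs, opts.maxDelayMs, random));
    }
  }
}
```

Full jitter spreads retries across the whole backoff window, so many workers hitting the same overloaded model do not retry in lockstep. Leaving non-429 errors unretried matches the description above: a timed-out attempt fails loudly and the workflow's own restart decides whether to run again.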

Note: ModelConfig in @flue/runtime is just a model-id string, so there's no provider-level retry/timeout knob — this is the application-level contract.

Verified by live wrangler tail of emdash-flue-review: the pipeline is healthy through git checkout and git diff, then goes silent at the (kimi) inference call with zero exceptions — consistent with a held-open 429.

Closes #

Type of change

Checklist

I have read CONTRIBUTING.md
pnpm typecheck passes
pnpm lint passes
pnpm test passes (or targeted tests for my change)
pnpm format has been run
I have added/updated tests for my changes (if applicable)
User-visible strings in the admin UI are wrapped for translation (if applicable)
I have added a changeset (if this PR changes a published package)
New features link to an approved Discussion

AI-generated code disclosure

This PR includes AI-generated code — model/tool: Claude Opus 4.8 (opencode)

Screenshots / test output

tsc --noEmit clean in infra/flue-review (after wrangler types) and .flue; oxfmt clean on all changed files.

…city gracefully

Workers AI returns 429 when a model is over capacity, and under sustained load the binding can hold a request open indefinitely. That left the deployed review workflow hung forever on a stalled inference call: no result, no posted review, just the container's keep-alive alarm firing for minutes (the cause of reviews silently not posting).

- Switch every kimi usage (review, fix, both classifiers, investigate classifier) to kimi-k2.7-code, which is currently less loaded.
- Add withCapacityRetry: bounds each model call with a hard per-attempt timeout (fails loudly instead of hanging) and retries genuine 429 capacity errors with exponential backoff + full jitter.
- Apply it to the flue-review skill call and to all model-bearing stages of the investigate/classify workflows.

changeset-bot · 2026-06-15T15:57:54Z

⚠️ No Changeset found

Latest commit: d5fe0ff

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


github-actions · 2026-06-15T15:58:58Z

Scope check

This PR changes 553 lines across 8 files. Large PRs are harder to review and more likely to be closed without review.

If this scope is intentional, no action needed. A maintainer will review it. If not, please consider splitting this into smaller PRs.

See CONTRIBUTING.md for contribution guidelines.

pkg-pr-new · 2026-06-15T16:01:01Z

@emdash-cms/admin

npm i https://pkg.pr.new/@emdash-cms/admin@1490

@emdash-cms/auth

npm i https://pkg.pr.new/@emdash-cms/auth@1490

@emdash-cms/auth-atproto

npm i https://pkg.pr.new/@emdash-cms/auth-atproto@1490

@emdash-cms/blocks

npm i https://pkg.pr.new/@emdash-cms/blocks@1490

@emdash-cms/cloudflare

npm i https://pkg.pr.new/@emdash-cms/cloudflare@1490

@emdash-cms/contentful-to-portable-text

npm i https://pkg.pr.new/@emdash-cms/contentful-to-portable-text@1490

emdash

npm i https://pkg.pr.new/emdash@1490

create-emdash

npm i https://pkg.pr.new/create-emdash@1490

@emdash-cms/gutenberg-to-portable-text

npm i https://pkg.pr.new/@emdash-cms/gutenberg-to-portable-text@1490

@emdash-cms/plugin-cli

npm i https://pkg.pr.new/@emdash-cms/plugin-cli@1490

@emdash-cms/plugin-types

npm i https://pkg.pr.new/@emdash-cms/plugin-types@1490

@emdash-cms/registry-client

npm i https://pkg.pr.new/@emdash-cms/registry-client@1490

@emdash-cms/registry-lexicons

npm i https://pkg.pr.new/@emdash-cms/registry-lexicons@1490

@emdash-cms/sandbox-workerd

npm i https://pkg.pr.new/@emdash-cms/sandbox-workerd@1490

@emdash-cms/x402

npm i https://pkg.pr.new/@emdash-cms/x402@1490

@emdash-cms/plugin-ai-moderation

npm i https://pkg.pr.new/@emdash-cms/plugin-ai-moderation@1490

@emdash-cms/plugin-atproto

npm i https://pkg.pr.new/@emdash-cms/plugin-atproto@1490

@emdash-cms/plugin-audit-log

npm i https://pkg.pr.new/@emdash-cms/plugin-audit-log@1490

@emdash-cms/plugin-color

npm i https://pkg.pr.new/@emdash-cms/plugin-color@1490

@emdash-cms/plugin-embeds

npm i https://pkg.pr.new/@emdash-cms/plugin-embeds@1490

@emdash-cms/plugin-field-kit

npm i https://pkg.pr.new/@emdash-cms/plugin-field-kit@1490

@emdash-cms/plugin-forms

npm i https://pkg.pr.new/@emdash-cms/plugin-forms@1490

@emdash-cms/plugin-webhook-notifier

npm i https://pkg.pr.new/@emdash-cms/plugin-webhook-notifier@1490

commit: d5fe0ff

github-actions Bot added the review/needs-review label ("No maintainer or bot review yet") on Jun 15, 2026

github-actions Bot added the size/XL label Jun 15, 2026

ascorbic merged commit eb5f7e4 into main Jun 15, 2026
45 checks passed

ascorbic deleted the fix/flue-kimi-429-handling branch June 15, 2026 16:15

ascorbic mentioned this pull request Jun 16, 2026

chore(flue): upgrade @flue to 0.11.1 and bump wrangler so the reviewer deploys #1501

Merged

18 tasks

ascorbic merged 1 commit into main from fix/flue-kimi-429-handling
