Fix runtime todo rendering and artifact echoes #4018
Hey @Siri-Ray! 👋 Triage complete and labeled. The regression scope is clear from the body: inline TodoWrite position restoration + artifact-echo suppression + Gemini native-only policy. The test coverage across both daemon and web is exactly the right shape for a fix like this. CI is still pending; pool review will kick off once it's green. 💡 This one touches daemon event streaming and web rendering together, so there may be a couple of review rounds. If you'd like to drive it to merge hands-free, paste this to your AI coding agent (Claude Code / Codex / opencode / Cursor …):
nettee
left a comment
I found a few blocking regressions in the new todo/artifact handling. The biggest ones are that the inline todo card now depends on the original message row staying mounted, and the stream parsers still leak provider-native todo/task tools alongside the canonical TodoWrite events.
Generated-By: looper 0.9.7 (runner=fixer, agent=codex)
nettee
left a comment
Three blocking regressions are still present on the current head: the canonical todo card can disappear in long conversations, the daemon still emits duplicate provider-native todo/task tool events alongside the normalized TodoWrite event, and artifact suppression now stays enabled for the rest of the run after the first file write. The inline comments below call out the concrete failure modes and the specific test coverage gaps to close.
@Siri-Ray — @nettee reviewed and flagged three blocking issues:
Looks like there's a new commit up ( |
Looper fixer round complete — cca2dba
🔁 Powered by Looper · runner=fixer · agent=codex · An autonomous AI dev team for your GitHub repos. |
nettee
left a comment
I found one remaining blocking regression in the artifact-echo suppression path. The inline comment below covers the concrete failure mode and the regression coverage that still needs to be added.
🔁 Powered by Looper · runner=reviewer · agent=codex · An autonomous AI dev team for your GitHub repos.
Generated-By: looper 0.9.7 (runner=fixer, agent=codex)
Looper fixer round complete — 3e39e0f
nettee
left a comment
Two blocking regressions remain on the current head: the standalone TodoWrite row is appended at the end of the transcript instead of the first TodoWrite position, and the artifact-echo filter now withholds ordinary post-write prose until flush/end when no <artifact> follows.
Looper fixer round complete — e05fda2
nettee
left a comment
One blocking artifact-suppression regression is still present on the current head. The inline comment below covers the remaining sequence where ordinary prose leaves the parser armed and a later real artifact is hidden.
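The artifact-echo failure modes raised in the review rounds above (suppression staying on for the rest of the run, prose being withheld, a later real artifact being hidden) suggest a one-shot filter. The following is a minimal sketch under stated assumptions, not the daemon's actual parser: `StreamEvent` and `suppressArtifactEchoes` are hypothetical names, and each text event is assumed to arrive as a complete segment, so an `<artifact>` tag split across chunks is not handled here.

```typescript
// Hypothetical event shape for illustration only.
type StreamEvent =
  | { kind: "file_write"; path: string }
  | { kind: "text"; text: string };

// Drops at most one <artifact> echo that directly follows a file write.
// Ordinary prose is never withheld, and it disarms the filter so that a
// later, genuine artifact is still shown.
function suppressArtifactEchoes(events: StreamEvent[]): StreamEvent[] {
  const out: StreamEvent[] = [];
  let armed = false;

  for (const event of events) {
    if (event.kind === "file_write") {
      armed = true; // the next artifact block may be an echo of this write
      out.push(event);
      continue;
    }

    // Whitespace-only text passes through without changing the armed state.
    if (event.text.trim() === "") {
      out.push(event);
      continue;
    }

    const isArtifact = event.text.trimStart().startsWith("<artifact");
    if (armed && isArtifact) {
      armed = false; // one-shot: suppress this echo only
      continue;
    }

    armed = false; // ordinary prose (or an unrelated artifact) disarms
    out.push(event);
  }

  return out;
}
```

With this shape, a write followed by an echo, then prose, then a second artifact keeps the write, the prose, and the second artifact; only the immediate echo is dropped.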
Looper fixer round complete — f230de3
nettee
left a comment
One blocking regression remains in the TodoWrite placement path: the canonical card is inserted only after the containing assistant message, so it still drifts from the original in-message TodoWrite position whenever that message has trailing prose or later tool output.
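The placement rule discussed in the reviews above (a single card at the first TodoWrite position, showing the latest snapshot) can be sketched as a pure function over transcript items. This is an illustrative sketch only: `Item`, `Todo`, and `collapseTodoSnapshots` are hypothetical names, and the real web renderer may be structured quite differently.

```typescript
// Hypothetical transcript model for illustration only.
interface Todo {
  content: string;
  status: "pending" | "in_progress" | "completed";
}

type Item =
  | { type: "text"; text: string }
  | { type: "todo"; todos: Todo[] };

// Keeps exactly one todo card, at the position of the FIRST TodoWrite,
// and fills it with the LATEST snapshot. Trailing prose and later tool
// output stay after the card, so the card does not drift to the end.
function collapseTodoSnapshots(items: Item[]): Item[] {
  let latest: Todo[] | undefined;
  for (const item of items) {
    if (item.type === "todo") latest = item.todos;
  }
  if (!latest) return items.slice(); // no TodoWrite in this transcript

  const out: Item[] = [];
  let placed = false;
  for (const item of items) {
    if (item.type !== "todo") {
      out.push(item);
      continue;
    }
    if (!placed) {
      out.push({ type: "todo", todos: latest });
      placed = true;
    }
    // Later snapshots are dropped; their content already lives in the card.
  }
  return out;
}
```

Keeping the function pure makes the position rule easy to cover in a unit test without mounting the message list.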
Looper fixer round complete — 984ef8e
nettee
left a comment
I found one blocking correctness issue in the new Claude task-to-TodoWrite normalization path. The inline comment below covers the collision and the regression coverage that still needs to be added.
nettee
left a comment
I found one remaining blocking correctness issue in the Claude task-to-TodoWrite normalization path. The inline comment below covers the concrete collision and the regression coverage that still needs to be added.
Looper fixer round complete — 546e863
nettee
left a comment
I found one remaining blocking regression in the artifact-echo suppression path. The inline comment below covers the concrete truncation case and the regression coverage that still needs to be added.
Looper fixer round complete — 8fe1415
nettee
left a comment
@Siri-Ray I re-ran the final pass on the current head and the remaining todo/artifact fixes look solid. I verified the daemon parser regressions for Claude/Gemini/Codex TodoWrite handling and artifact-echo suppression, and the web coverage for inline Todo card placement, virtualization, and streaming behavior all pass locally. I also re-ran the underlying tsc steps for apps/web and apps/daemon directly with the pinned toolchain after the packaged daemon typecheck script tripped over a nested pnpm engine mismatch in this environment. Nice cleanup on a tricky cross-runtime bug loop.

Why
This fixes regressions found while validating TodoWrite behavior across agent runtimes. The user-facing pain was twofold: Gemini / Claude-style runs could dump a full code artifact into chat after the file was already written, and the TodoWrite UI had drifted from its original in-message position into a composer-pinned card.
The runtime conclusion is that Gemini 3 / preview does not reliably expose native write_todos, so Open Design should not synthesize TodoWrite state from markdown, TODO files, temporary JSON files, or shell heredocs. We only map standard structured runtime events.

What users will see
<artifact> / full-code block after the actual file-write tool has already produced the file.
write_todos; it no longer simulates todos from markdown/files/shell output.

Surface area
apps/web or apps/desktop (including Electron menu bar)
od subcommand or flag, new tools-dev/tools-pack/tools-pr flag, or new OD_* env var
/api/* endpoint, new SSE event, or changed shape in packages/contracts
skills/, design-systems/, design-templates/, or craft/, or change to the skills protocol
TRANSLATIONS.md for the locale workflow)
package.json (dependencies or devDependencies); workspace-package package.json files are out of scope. Include a paragraph on what we get vs. what bytes we ship (see CONTRIBUTING.md → Code style)

Screenshots
Not attached from this environment. The changed UI behavior is covered by apps/web/tests/components/chat-todo-autoscroll.test.tsx: no .chat-pinned-todo, one inline Todo card, latest snapshot updates the original card.

Bug fix verification
apps/daemon/tests/json-event-stream.test.ts
apps/daemon/tests/structured-streams.test.ts
apps/web/tests/components/chat-todo-autoscroll.test.tsx
main and green on this branch? No. The local repro was interactive/manual; this PR adds focused regression coverage for the corrected behavior.

Validation
corepack pnpm --filter @open-design/daemon exec vitest run -c vitest.config.ts tests/json-event-stream.test.ts -t "gemini stream|codex json stream emits TodoWrite"
corepack pnpm --filter @open-design/daemon exec vitest run -c vitest.config.ts tests/structured-streams.test.ts -t "Claude"
corepack pnpm --filter @open-design/daemon typecheck
corepack pnpm --filter @open-design/web exec vitest run -c vitest.config.ts tests/components/chat-todo-autoscroll.test.tsx tests/components/assistant-message-unfinished-todos.test.tsx
corepack pnpm --filter @open-design/web typecheck
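
The native-only policy from the "Why" section (map structured runtime events to TodoWrite, never synthesize todos from markdown, files, or shell output) can be illustrated with a small normalizer. This is a sketch under assumptions, not the daemon's implementation: `ToolEvent`, `normalizeToolEvent`, the `{ todos: [...] }` payload shape, and the `run_shell_command` tool name in the usage note are all hypothetical; only `write_todos` and `TodoWrite` come from the PR text.

```typescript
type TodoStatus = "pending" | "in_progress" | "completed";

interface TodoItem {
  content: string;
  status: TodoStatus;
}

// Hypothetical structured tool event for illustration only.
interface ToolEvent {
  name: string;
  input: unknown;
}

// Native todo tools that map to the canonical TodoWrite event.
const NATIVE_TODO_TOOLS = new Set(["write_todos"]);

function isTodoItem(value: unknown): value is TodoItem {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.content === "string" &&
    (record.status === "pending" ||
      record.status === "in_progress" ||
      record.status === "completed")
  );
}

// Returns ONE event per input. A native todo tool is replaced by the
// canonical TodoWrite (not emitted alongside it); every other tool,
// including shell commands that happen to write a TODO file, passes
// through untouched and is never parsed for todo-looking text.
function normalizeToolEvent(event: ToolEvent): ToolEvent {
  if (!NATIVE_TODO_TOOLS.has(event.name)) return event;

  const input = event.input as { todos?: unknown } | null;
  const raw = input && Array.isArray(input.todos) ? input.todos : [];
  return { name: "TodoWrite", input: { todos: raw.filter(isTodoItem) } };
}
```

Under these assumptions, a `write_todos` call yields exactly one `TodoWrite` event, while a `run_shell_command` event whose command is a heredoc writing `TODO.md` is returned unchanged.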