
feat: add chat UI, status HUD, and audio transcription #137

Merged
ComBba merged 2 commits into main from feat/chat-ui-and-hud
Feb 26, 2026
Conversation

@ComBba commented Feb 26, 2026 (Contributor)

Summary

  • Enable Gemini Live API audio transcription (InputAudioTranscription + OutputAudioTranscription) for real-time speech-to-text
  • Add ChatPanel component with streaming chat bubbles (model messages left, user messages right)
  • Add StatusHUD (top-left) showing connection status, session state, mic/speaking indicators
  • Add ActionsHUD (top-right) showing available actions per session phase (onboarding vs reunion)
  • Replace old connection indicator and transcript overlay with new HUD components

Local CI

  • go vet passed
  • go test -race passed (15/15 packages)
  • frontend build passed
  • frontend test passed (60/60 tests)

Test plan

  • Verify chat bubbles appear during voice conversation (model on left, user on right)
  • Verify StatusHUD shows correct connection state and mic/speaking indicators
  • Verify ActionsHUD changes between onboarding and reunion actions
  • Verify streaming partial transcription shows with lower opacity, finalized messages full opacity

🤖 Generated with Claude Code

Summary by CodeRabbit

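Release notes

  • New features
    • Improved the user interface by adding a chat panel, a status indicator (HUD), and an actions menu (HUD)
    • Implemented a streaming chat message system to support real-time message display
    • Added input and output audio transcription
    • Added message completion tracking to distinguish partial messages from complete ones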

- Enable InputAudioTranscription and OutputAudioTranscription in Live API
  configs for both onboarding and reunion sessions
- Forward user/model speech transcripts from Go backend to browser
- Add ChatPanel component with streaming message bubbles (model left, user right)
- Add StatusHUD component (top-left) showing connection, session state, mic/speaking
- Add ActionsHUD component (top-right) showing available actions per session state
- Replace old connection indicator and transcript overlay with new HUD components
- Handle partial/streaming transcription with pending message accumulation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist


Summary of Changes

Hello @ComBba, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the user experience for real-time voice conversations by integrating advanced UI components and enabling comprehensive audio transcription. It transitions from a basic transcript overlay to a dynamic chat interface and introduces dedicated heads-up displays for connection status, session state, and available actions, making the interaction more intuitive and informative for users.

Highlights

  • Audio Transcription Integration: Enabled real-time audio transcription for both user input and model output via the Gemini Live API, allowing for speech-to-text functionality.
  • Chat UI Implementation: Introduced a new ChatPanel component to display streaming chat bubbles, clearly differentiating between user and model messages with distinct styling and smooth scrolling.
  • Status and Actions Heads-Up Displays (HUDs): Implemented a StatusHUD component to show connection status, session state, and microphone/speaking indicators, and an ActionsHUD component to dynamically display available actions based on the current session phase (onboarding or reunion).
  • UI Modernization: Replaced the previous connection indicator and basic transcript overlay with the new StatusHUD and ChatPanel components for a more integrated and informative user interface.
Changelog
  • internal/live/proxy.go
    • Added logic to forward user input and model output transcriptions as WebSocket messages, including a 'finished' flag for partial vs. final text.
  • internal/session/manager.go
    • Configured InputAudioTranscription and OutputAudioTranscription for both onboarding and reunion session configurations to enable real-time speech processing.
  • internal/session/manager_test.go
    • Added test assertions to verify that audio transcription configurations are correctly enabled for both onboarding and reunion session setups.
  • web/app/page.tsx
    • Removed the old CONNECTION_COLORS constant and transcript state, and introduced chatMessages state, pendingMsgRef, and msgIdRef for managing chat messages.
    • Updated the handleMessage function to process incoming transcript messages, accumulating partial texts and finalizing messages for display in the ChatPanel.
    • Modified the resetState function to clear chat messages and pending message references upon state reset.
    • Removed the old connection indicator and transcript overlay JSX from the main page component.
    • Integrated the new StatusHUD, ActionsHUD, and ChatPanel components into the main application layout.
  • web/components/ActionsHUD.tsx
    • Added a new React component to display context-sensitive actions (e.g., 'Talk', 'Share Video') based on the current sessionState.
  • web/components/ChatPanel.tsx
    • Added a new React component to render streaming chat messages, featuring distinct styling for user and model roles, and smooth scrolling to the latest message.
  • web/components/StatusHUD.tsx
    • Added a new React component to display real-time connection status, microphone activity, speaking indicators, and the current session state.
  • web/hooks/useWebSocket.ts
    • Updated the ServerMessage type definition to include an optional finished boolean property for transcript messages, indicating whether a transcription is complete or partial.
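As a rough illustration of the optional `finished` flag described in the changelog above, a browser-side parser could look like the sketch below. Only the `type`, `role`, `text`, and `finished` fields come from this PR; the `parseTranscript` helper and the choice to treat a missing flag as "partial" are assumptions for illustration, not the actual code in web/hooks/useWebSocket.ts.

```typescript
type Role = 'user' | 'model';

// Fields named in this PR; any other message fields are out of scope here.
interface TranscriptMessage {
  type: 'transcript';
  role: Role;
  text: string;
  finished: boolean;
}

// Hypothetical helper: returns a normalized transcript message,
// or null when the payload is not a well-formed transcript.
function parseTranscript(raw: string): TranscriptMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== 'object' || data === null) return null;

  const m = data as Record<string, unknown>;
  const role = m['role'];
  const text = m['text'];
  if (m['type'] !== 'transcript') return null;
  if (role !== 'user' && role !== 'model') return null;
  if (typeof text !== 'string') return null;

  // Assumption: a missing `finished` flag means the chunk is still partial.
  return {
    type: 'transcript',
    role: role as Role,
    text,
    finished: m['finished'] === true,
  };
}
```

Normalizing the optional flag to a plain boolean at the boundary keeps the rest of the UI code free of `undefined` checks.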
Activity
  • Local CI checks passed for Go vet, Go test with race detection, frontend build, and frontend tests.
  • A detailed test plan was provided to verify the functionality of chat bubbles, StatusHUD, ActionsHUD, and streaming transcription behavior.

@coderabbitai

coderabbitai Bot commented Feb 26, 2026


Warning

Rate limit exceeded

@ComBba has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 17 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 70f963a and 840b50d.

📒 Files selected for processing (2)
  • internal/live/proxy.go
  • web/app/page.tsx

Walkthrough

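Backend support for audio transcription handling has been added, and the frontend now processes transcripts through a streaming chat message system. New UI components (ChatPanel, StatusHUD, ActionsHUD) were introduced to improve the user interface.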

Changes

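| Cohort / File(s) | Summary |
|---|---|
| Backend transcription handling: internal/live/proxy.go | Converts input and output transcription messages to JSON and forwards them to the browser along with a finished flag. |
| Session config extension: internal/session/manager.go, internal/session/manager_test.go | Adds the InputAudioTranscription and OutputAudioTranscription fields to the onboarding and reunion configs and checks them for nil in tests. |
| WebSocket message type update: web/hooks/useWebSocket.ts | Adds an optional finished boolean property to transcript server messages. |
| Chat UI components: web/components/ChatPanel.tsx, web/components/StatusHUD.tsx, web/components/ActionsHUD.tsx | A new set of UI components for message rendering, connection status display, and the list of available actions. |
| Page integration: web/app/page.tsx | Converts transcripts into streaming chat messages, integrates the new HUD components, and renders ChatPanel. |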

Sequence Diagram(s)

sequenceDiagram
    participant Backend as Backend<br/>(proxy.go)
    participant WebSocket as WebSocket
    participant Page as Page<br/>(page.tsx)
    participant ChatPanel as ChatPanel<br/>Component

    Backend->>WebSocket: Transcription message<br/>(InputTranscription/OutputTranscription)
    WebSocket->>Page: ServerMessage<br/>type: 'transcript'<br/>finished: boolean
    Page->>Page: Update chatMessages state<br/>accumulate partial text
    Page->>Page: When finished=true<br/>create completed message
    Page->>ChatPanel: Pass chatMessages
    ChatPanel->>ChatPanel: Render messages<br/>by role

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Voices flow down through the proxy,
chat bubbles murmur and float along,
the HUDs sparkle and dance,
messages become a stream and drift away,
and the magic of transcription blooms~

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and specifically summarizes the main changes: adding chat UI, status HUD, and audio transcription support across both backend and frontend. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces significant UI enhancements by adding a real-time chat panel, a status HUD, and an actions HUD. The backend is updated to enable and forward audio transcriptions from the Gemini Live API, which powers the new chat interface. The frontend logic for handling streaming and finalized transcript messages is well-implemented, though there is a small bug in the state update logic that could leave stale messages on the screen. Additionally, on the backend, there appears to be a duplication in how model transcripts are sent to the client, which could result in duplicate messages. My review includes suggestions to address these two high-severity issues. Overall, this is a great feature addition that dramatically improves the user experience.

Comment thread internal/live/proxy.go
Comment on lines +301 to +308
  if content.OutputTranscription != nil && content.OutputTranscription.Text != "" {
    p.sendJSON(map[string]any{
      "type":     "transcript",
      "role":     "model",
      "text":     content.OutputTranscription.Text,
      "finished": content.OutputTranscription.Finished,
    })
  }


high

This new block for forwarding the model's output transcription appears to duplicate existing logic. The loop over content.ModelTurn.Parts on lines 271-286 already sends a transcript message with the model's text. This will likely result in duplicate chat messages being displayed in the UI. Since this new OutputTranscription path correctly provides the finished flag, which the new UI logic relies on, the sendJSON call within the ModelTurn loop should probably be removed to resolve the duplication.

Comment thread web/app/page.tsx
Comment on lines +66 to +81
        if (finished) {
          // Finalize: flush pending partial text into a completed message.
          const pending = pendingMsgRef.current[role];
          const finalText = pending ? pending + text : text;
          if (finalText) {
            const id = String(msgIdRef.current++);
            setChatMessages((prev) => {
              // Remove the in-progress placeholder for this role if present.
              const cleaned = prev.filter(
                (m) => !(m.role === role && !m.finished),
              );
              return [...cleaned, { id, role, text: finalText, finished: true }];
            });
          }
          pendingMsgRef.current[role] = null;
        } else {


high

There's a potential bug in how finished transcript messages are handled. If a finished: true message arrives and the resulting finalText is empty, the if (finalText) condition prevents setChatMessages from being called. This means an in-progress message for that role could get stuck on the screen, as it's never cleared. The logic should be restructured to ensure the pending message is always removed when a finished message is processed, regardless of whether the final text is empty.

        if (finished) {
          // Finalize: flush pending partial text into a completed message.
          const pending = pendingMsgRef.current[role];
          const finalText = pending ? pending + text : text;
          if (finalText) {
            const id = String(msgIdRef.current++);
            setChatMessages((prev) => {
              // Remove the in-progress placeholder for this role if present.
              const cleaned = prev.filter(
                (m) => !(m.role === role && !m.finished),
              );
              return [...cleaned, { id, role, text: finalText, finished: true }];
            });
          } else {
            // If final text is empty, just remove the pending message from the UI.
            setChatMessages((prev) =>
              prev.filter((m) => !(m.role === role && !m.finished)),
            );
          }
          pendingMsgRef.current[role] = null;
        }
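
The finalize rule in the suggestion above can also be expressed as a pure function, which makes it straightforward to test outside React. The sketch below is a hypothetical rewrite under stated assumptions, not the code in web/app/page.tsx: `ChatState`, `applyTranscript`, and the `pending-<role>` placeholder id are invented names, and partial chunks are assumed to be deltas that get concatenated, as in the reviewed snippet.

```typescript
type Role = 'user' | 'model';

interface ChatMessage {
  id: string;
  role: Role;
  text: string;
  finished: boolean;
}

// Hypothetical state shape combining chatMessages, pendingMsgRef, and msgIdRef.
interface ChatState {
  messages: ChatMessage[];
  pending: Record<Role, string | null>;
  nextId: number;
}

const initialState: ChatState = {
  messages: [],
  pending: { user: null, model: null },
  nextId: 0,
};

function applyTranscript(
  state: ChatState,
  role: Role,
  text: string,
  finished: boolean,
): ChatState {
  // Always drop the in-progress placeholder for this role first, so an
  // empty final chunk can never leave a stale bubble behind.
  const settled = state.messages.filter((m) => !(m.role === role && !m.finished));
  const combined = (state.pending[role] ?? '') + text;
  const pending = { ...state.pending };

  if (!finished) {
    if (combined === '') return state; // nothing to show yet
    pending[role] = combined;
    const placeholder: ChatMessage = {
      id: `pending-${role}`,
      role,
      text: combined,
      finished: false,
    };
    return { messages: [...settled, placeholder], pending, nextId: state.nextId };
  }

  // Finished: commit the accumulated text if there is any, then clear pending.
  pending[role] = null;
  if (combined === '') {
    return { messages: settled, pending, nextId: state.nextId };
  }
  const done: ChatMessage = {
    id: String(state.nextId),
    role,
    text: combined,
    finished: true,
  };
  return { messages: [...settled, done], pending, nextId: state.nextId + 1 };
}
```

Because the placeholder is filtered out before the empty-text check, both branches of the finished case clear it, which is the property the review comment asks for.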

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70f963ad2d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/live/proxy.go
Comment on lines +301 to +306
  if content.OutputTranscription != nil && content.OutputTranscription.Text != "" {
    p.sendJSON(map[string]any{
      "type":     "transcript",
      "role":     "model",
      "text":     content.OutputTranscription.Text,
      "finished": content.OutputTranscription.Finished,


P1: Stop sending duplicate model transcripts to the client

This new branch emits a second transcript stream for role: "model" even though handleServerContent already emits model text from ModelTurn.Parts above. When OutputAudioTranscription is enabled, Live messages can include both sources for the same utterance, and web/app/page.tsx currently merges chunks by role into a single pending message, which leads to duplicated/garbled chat text in the HUD. Use one canonical model transcript source (or a distinct message type) so the frontend does not interleave two model streams.
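
One reading of "one canonical model transcript source" is sketched below. It is written in TypeScript for consistency with the frontend snippets in this thread rather than in the Go of internal/live/proxy.go, and the `ServerContent` shape, the `modelTurnTexts` field, and `modelTranscriptEvents` are simplified, hypothetical stand-ins for the real Live API types.

```typescript
// Simplified stand-ins for the fields discussed in this comment (assumed shapes).
interface Transcription {
  text: string;
  finished: boolean;
}

interface ServerContent {
  modelTurnTexts?: string[]; // stand-in for text carried in ModelTurn.Parts
  outputTranscription?: Transcription; // stand-in for OutputTranscription
}

interface TranscriptEvent {
  type: 'transcript';
  role: 'model';
  text: string;
  finished: boolean;
}

// Emit model transcript events from OutputTranscription only.
// Text in the model turn parts is deliberately ignored so the client
// never receives two streams for the same utterance.
function modelTranscriptEvents(content: ServerContent): TranscriptEvent[] {
  const t = content.outputTranscription;
  if (!t || t.text === '') return [];
  return [{ type: 'transcript', role: 'model', text: t.text, finished: t.finished }];
}
```

Choosing the transcription path as the single source keeps the `finished` flag available to the client, which the part-text path does not provide.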


Comment thread internal/live/proxy.go
Comment on lines +290 to +292
  if content.InputTranscription != nil && content.InputTranscription.Text != "" {
    p.toolHandler.AddTranscript("user", content.InputTranscription.Text)
    p.sendJSON(map[string]any{


P2: Persist user transcription only when chunk is finished

AddTranscript("user", ...) runs for every InputTranscription update, including partial chunks (Finished == false). Because analyze_user uses only a bounded recent transcript buffer, long in-progress utterances can fill the buffer with fragments and evict real prior turns, degrading analysis quality. Coalesce partial input transcription updates and persist only completed user turns to the transcript store.
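
The eviction problem described above can be reproduced with a toy model. The sketch below is illustrative TypeScript, not the Go transcript store behind `AddTranscript`: `TranscriptBuffer`, `UserTurnCoalescer`, and the capacity of 3 are all assumptions chosen to keep the example small.

```typescript
// Toy bounded buffer: keeps only the most recent `capacity` entries.
class TranscriptBuffer {
  private entries: { role: string; text: string }[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  add(role: string, text: string): void {
    this.entries.push({ role, text });
    if (this.entries.length > this.capacity) {
      this.entries.shift(); // evict the oldest entry
    }
  }

  snapshot(): string[] {
    return this.entries.map((e) => `${e.role}: ${e.text}`);
  }
}

// Accumulates partial chunks and stores one entry per finished user turn.
class UserTurnCoalescer {
  private partial = '';
  private readonly buffer: TranscriptBuffer;

  constructor(buffer: TranscriptBuffer) {
    this.buffer = buffer;
  }

  onChunk(text: string, finished: boolean): void {
    this.partial += text;
    if (!finished) return;
    if (this.partial !== '') {
      this.buffer.add('user', this.partial);
    }
    this.partial = '';
  }
}
```

With a capacity of 3, storing three partial fragments directly pushes an earlier model turn out of the buffer, while the coalescer stores the whole utterance as a single entry and leaves the earlier turn in place.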


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/hooks/useWebSocket.ts (1)

29-37: ⚠️ Potential issue | 🟠 Major

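There is no automatic reconnection after the connection closes, which hurts session resilience.

onclose only sets the state to disconnected and stops there, so after a transient network drop the user has to restart manually.

🔁 Suggested patch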
 export function useWebSocket(url: string, onMessage: (msg: ServerMessage) => void) {
   const wsRef = useRef<WebSocket | null>(null);
+  const reconnectTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null);
+  const shouldReconnectRef = useRef(true);
   const [state, setState] = useState<ConnectionState>('disconnected');

   const connect = useCallback(() => {
     setState('connecting');
     const ws = new WebSocket(url);
     ws.binaryType = 'arraybuffer';

     ws.onopen = () => setState('connected');
-    ws.onclose = () => setState('disconnected');
+    ws.onclose = () => {
+      setState('disconnected');
+      if (shouldReconnectRef.current) {
+        reconnectTimerRef.current = setTimeout(() => connect(), 1000);
+      }
+    };
     ws.onerror = () => setState('error');
@@
   const disconnect = useCallback(() => {
+    shouldReconnectRef.current = false;
+    if (reconnectTimerRef.current) {
+      clearTimeout(reconnectTimerRef.current);
+      reconnectTimerRef.current = null;
+    }
     wsRef.current?.close();
     wsRef.current = null;
   }, []);

As per coding guidelines, "Manage WebSocket connection lifecycle in useWebSocket hook with reconnection logic".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/hooks/useWebSocket.ts` around lines 29 - 37, The WebSocket currently sets
ws.onclose and ws.onerror to only update state, so add automatic reconnection
logic inside the connect function: when ws.onclose or ws.onerror fires, set
state appropriately (e.g., 'disconnected' or 'error') and schedule a reconnect
attempt using exponential backoff (or fixed retry interval) while preserving the
current url and hook lifecycle; ensure you clear pending timers when unmounting
and avoid multiple concurrent connections by cancelling previous reconnect
attempts before creating a new WebSocket. Reference the connect function,
ws.onclose, ws.onerror, setState, and url to locate where to implement the
reconnection/backoff and cleanup logic.
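
The exponential backoff and "cancel previous reconnect attempts" ideas in the prompt above could be sketched as follows. This is a minimal illustration under assumptions, not the hook's actual code: `backoffDelay`, `ReconnectScheduler`, the 1 s base delay, and the 30 s cap are hypothetical choices.

```typescript
// Delay before reconnect attempt number `attempt` (0-based), doubling each time
// and capped at `maxMs`.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
  const exponential = baseMs * 2 ** Math.max(0, attempt);
  return Math.min(maxMs, exponential);
}

class ReconnectScheduler {
  private attempt = 0;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private stopped = false;
  private readonly connect: () => void;

  constructor(connect: () => void) {
    this.connect = connect;
  }

  // Call from onclose/onerror. Returns the delay used, or null if nothing
  // was scheduled (already pending, or stopped).
  schedule(): number | null {
    if (this.stopped || this.timer !== null) return null; // never stack timers
    const delay = backoffDelay(this.attempt);
    this.attempt += 1;
    this.timer = setTimeout(() => {
      this.timer = null;
      this.connect();
    }, delay);
    return delay;
  }

  // Call from onopen so the next outage starts from the base delay again.
  reset(): void {
    this.attempt = 0;
  }

  // Call on unmount or explicit disconnect to cancel any pending attempt.
  stop(): void {
    this.stopped = true;
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }
}
```

In a hook, `schedule()` would be wired to `ws.onclose` and `ws.onerror`, `reset()` to `ws.onopen`, and `stop()` to the effect cleanup and the `disconnect` callback.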
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/live/proxy.go`:
- Around line 289-308: Avoid duplicating and accumulating partial transcripts by
only forwarding and adding to tool context once the transcript is final: change
the logic around AddTranscript and sendJSON for
content.InputTranscription/part.Text and content.OutputTranscription so that
partial (unfinished) segments are not passed to p.toolHandler.AddTranscript, and
ensure you do not emit the same model utterance twice when both part.Text and
content.OutputTranscription.Text are present (prefer the final
OutputTranscription when Finished==true or dedupe by skipping part.Text if
OutputTranscription exists). In short, gate AddTranscript and sendJSON on the
Finished flag and add a check so model transcripts from part.Text are suppressed
when content.OutputTranscription is present to prevent duplicate render/context
pollution.


ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89088ce and 70f963a.

📒 Files selected for processing (8)
  • internal/live/proxy.go
  • internal/session/manager.go
  • internal/session/manager_test.go
  • web/app/page.tsx
  • web/components/ActionsHUD.tsx
  • web/components/ChatPanel.tsx
  • web/components/StatusHUD.tsx
  • web/hooks/useWebSocket.ts

Comment thread internal/live/proxy.go
Comment on lines +289 to +308
  // Forward input transcription (what the user said).
  if content.InputTranscription != nil && content.InputTranscription.Text != "" {
    p.toolHandler.AddTranscript("user", content.InputTranscription.Text)
    p.sendJSON(map[string]any{
      "type":     "transcript",
      "role":     "user",
      "text":     content.InputTranscription.Text,
      "finished": content.InputTranscription.Finished,
    })
  }

  // Forward output transcription (what the model said, as text).
  if content.OutputTranscription != nil && content.OutputTranscription.Text != "" {
    p.sendJSON(map[string]any{
      "type":     "transcript",
      "role":     "model",
      "text":     content.OutputTranscription.Text,
      "finished": content.OutputTranscription.Finished,
    })
  }


⚠️ Potential issue | 🟠 Major

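Transcription events are sent twice and accumulated as partials, which can cause duplicate chat messages and context pollution.

If the Line 276 path (part.Text) and the Line 301 path (OutputTranscription.Text) both emit a model transcript, the same utterance is rendered twice. In addition, Line 291 passes partials to AddTranscript before they are finished, which can needlessly bloat the tool context.

🧩 Suggested patch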
-  // Forward input transcription (what the user said).
+  // Forward input transcription (what the user said).
   if content.InputTranscription != nil && content.InputTranscription.Text != "" {
-    p.toolHandler.AddTranscript("user", content.InputTranscription.Text)
+    if content.InputTranscription.Finished {
+      p.toolHandler.AddTranscript("user", content.InputTranscription.Text)
+    }
     p.sendJSON(map[string]any{
       "type":     "transcript",
       "role":     "user",
       "text":     content.InputTranscription.Text,
       "finished": content.InputTranscription.Finished,
     })
   }

-  // Forward output transcription (what the model said, as text).
-  if content.OutputTranscription != nil && content.OutputTranscription.Text != "" {
+  // Forward output transcription (what the model said, as text).
+  // Prefer a single model transcript source to avoid duplicates with ModelTurn.Part.Text.
+  if content.OutputTranscription != nil && content.OutputTranscription.Text != "" {
+    p.toolHandler.AddTranscript("model", content.OutputTranscription.Text)
     p.sendJSON(map[string]any{
       "type":     "transcript",
       "role":     "model",
       "text":     content.OutputTranscription.Text,
       "finished": content.OutputTranscription.Finished,
     })
   }

…Transcript

- Remove model text forwarding from part.Text (use OutputTranscription only)
  to prevent duplicate chat messages in browser
- Only call AddTranscript("user") when InputTranscription.Finished is true
  to avoid bloating tool context with partial speech chunks
- Handle empty finalized transcript by cleaning up placeholder messages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 840b50d255


Comment thread internal/live/proxy.go
}

// Forward input transcription (what the user said).
if content.InputTranscription != nil && content.InputTranscription.Text != "" {


P2: Forward empty finished transcription chunks to clients

handleServerContent drops transcription updates whenever Text is empty, so a Finished=true terminal chunk with no text is never forwarded. In that case the frontend never receives the finalize signal it needs to clear/commit the pending bubble (it already has explicit empty-finalize handling), and the backend also skips AddTranscript("user", ...) because that is only executed on finished events inside this same non-empty guard. This can leave stale/concatenated chat turns and lose user utterances from tool context for analyze_user.
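
The guard change requested above amounts to forwarding a chunk when it has text or when it is the terminal chunk. The sketch below models that predicate in TypeScript for illustration only; the real check lives in Go in `handleServerContent`, and `Transcription` and `shouldForward` are hypothetical names.

```typescript
// Simplified stand-in for the transcription payload (assumed shape).
interface Transcription {
  text: string;
  finished: boolean;
}

// Forward when there is text to show, or when the chunk marks the end of the
// utterance, so the client always receives the finalize signal.
function shouldForward(t: Transcription | null | undefined): boolean {
  if (!t) return false;
  return t.text !== '' || t.finished;
}
```

Compared with a text-only guard, the only behavioral difference is the empty terminal chunk, which is exactly the case the frontend's empty-finalize handling expects to receive.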


@ComBba ComBba merged commit b38f172 into main Feb 26, 2026
8 of 9 checks passed
@ComBba ComBba deleted the feat/chat-ui-and-hud branch February 26, 2026 02:47