feat: add microphone input, strip markdown, fix system prompt#135

Merged: ComBba merged 1 commit into main from feat/mic-input-and-transcript-fix on Feb 26, 2026

Conversation

ComBba (Contributor) commented Feb 26, 2026

Summary

  • Microphone input: Add useMicrophone hook that captures browser mic audio (16kHz PCM Int16) and sends it via WebSocket binary to the Go backend → Gemini Live API. Users can now have real voice conversations.
  • Markdown stripping: Add stripMarkdown utility to clean markdown formatting (**bold**, ## headers, etc.) from model transcript before display. Voice apps should show clean text, not raw markdown.
  • System prompt fix: Rewrite onboarding system instruction to prevent the model from exposing internal instructions, narrating its actions ("Initiating Welcome Sequence"), or using markdown. Change greeting trigger from meta-instruction to natural "Hello!".
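
The Float32 → Int16 step mentioned in the first bullet can be sketched as follows. This is a minimal illustration, not the hook's actual code (which this page does not show); the function name `floatToInt16` is hypothetical.

```typescript
// Hypothetical helper: convert Web Audio Float32 samples in [-1, 1]
// to 16-bit signed PCM, the format the summary says is sent over the WebSocket.
function floatToInt16(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  samples.forEach((value, i) => {
    // Clamp first: values outside [-1, 1] would otherwise wrap around.
    const s = Math.max(-1, Math.min(1, value));
    // The Int16 range is asymmetric (-32768..32767), so scale each sign separately.
    // Storing into an Int16Array truncates the fractional part.
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  });
  return out;
}

// Usage: the underlying buffer is what would be sent as a binary frame.
const pcmChunk = floatToInt16(new Float32Array([0, 1, -1, 0.5, -0.5, 2]));
console.log(Array.from(pcmChunk));
```

Clamping before scaling matters because browser audio can briefly exceed the nominal range, and an unclamped value would overflow into the opposite sign.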

Changes

| File | Change |
| --- | --- |
| web/hooks/useMicrophone.ts | New hook: getUserMedia → ScriptProcessor → Float32→Int16 PCM |
| web/lib/stripMarkdown.ts | New util: regex-based markdown stripping |
| web/app/page.tsx | Integrate mic + stripMarkdown |
| internal/session/manager.go | Improved system prompt with anti-leak rules |
| internal/handler/websocket.go | Natural greeting trigger |
| Tests | 15 new tests (useMicrophone: 5, stripMarkdown: 10) |

Local CI

  • Go test -race passed (session, handler packages)
  • Go vet passed
  • Frontend tests passed (10 files, 60 tests)
  • Next.js build passed

Test plan

  • Open the app in browser, click "Begin Reunion"
  • Verify microphone permission is requested
  • Verify red recording indicator appears in top-right corner
  • Speak and verify the AI responds to your voice
  • Verify transcript text appears without markdown formatting
  • Verify AI greeting is natural (not "Initiating Welcome Sequence")

🤖 Generated with Claude Code

Summary by CodeRabbit

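Release Notes

  • New Features

    • Added microphone input: start/stop voice recording with an activity indicator
    • Markdown formatting is stripped from the transcript for cleaner text display
    • Improved system prompt for a more natural conversational experience
  • Tests

    • Added comprehensive tests for the microphone hook and the markdown utility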

…m prompt

- Add useMicrophone hook: captures mic audio via getUserMedia + ScriptProcessor,
  converts Float32 → Int16 PCM at 16kHz, sends via WebSocket binary
- Add stripMarkdown utility: removes markdown formatting (bold, italic, headers,
  code, links) from model transcript for clean voice-app display
- Improve onboarding system prompt: prevent model from exposing internal
  instructions, remove markdown usage, enforce natural conversational style
- Change greeting trigger from meta-instruction to simple "Hello!" to avoid
  the model narrating its own behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@gemini-code-assist

Summary of Changes

Hello @ComBba, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the user experience by introducing voice interaction capabilities and refining the AI's conversational output. It integrates microphone input for real-time audio communication, ensures AI responses are presented cleanly by stripping markdown, and improves the AI's initial greeting and adherence to conversational guidelines.

Highlights

  • Microphone Input: Implemented a useMicrophone hook to capture browser audio (16kHz PCM Int16) and send it via WebSocket to the Go backend, enabling real-time voice conversations with the Gemini Live API.
  • Markdown Stripping: Introduced a stripMarkdown utility to remove markdown formatting from model transcripts before display, ensuring clean text for voice applications.
  • System Prompt Refinement: Revised the onboarding system instruction to prevent the model from exposing internal directives, narrating actions, or using markdown, and changed the greeting trigger to a natural "Hello!".
Changelog
  • internal/handler/websocket.go
    • Updated the initial greeting trigger sent to the AI from a meta-instruction to a direct "Hello!".
  • internal/session/manager.go
    • Revised the AI's system instruction to prevent internal instruction leakage, action narration, and markdown usage, while emphasizing a warm, natural, and concise conversational style.
  • internal/session/manager_test.go
    • Adjusted the test expectation for the system instruction to check for mentions of "missless" and "welcome" instead of "Korean greeting".
  • web/__tests__/hooks/useMicrophone.test.ts
    • Added comprehensive unit tests for the useMicrophone hook, covering recording state, microphone access, PCM data conversion, and resource cleanup.
  • web/__tests__/lib/stripMarkdown.test.ts
    • Added unit tests for the stripMarkdown utility, verifying its ability to remove various markdown elements like bold, italic, headers, links, and blockquotes.
  • web/app/page.tsx
    • Integrated the useMicrophone hook to enable audio input and sending PCM data via WebSocket.
    • Applied the stripMarkdown utility to incoming transcript messages before setting the display state.
    • Added a visual indicator for microphone activity.
  • web/hooks/useMicrophone.ts
    • Implemented the useMicrophone React hook to manage microphone access, capture audio, convert it to 16kHz PCM Int16, and provide start/stop functionality.
  • web/lib/stripMarkdown.ts
    • Created a utility function stripMarkdown that uses regular expressions to remove common markdown formatting from a given text string.
Activity
  • Local CI checks were performed, including Go tests (go test -race), Go vet, frontend tests, and Next.js build.
  • A detailed test plan was provided for manual verification of microphone functionality, markdown stripping, and AI greeting.

coderabbitai Bot commented Feb 26, 2026

Walkthrough

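Adds microphone audio capture, introduces a markdown-stripping utility, and updates the backend greeting and the onboarding system instruction text.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Backend greeting and onboarding update: internal/handler/websocket.go, internal/session/manager.go, internal/session/manager_test.go | Changed the initial greeting from "(The user just connected. Please greet them warmly.)" to "Hello!" and replaced the onboarding system instruction in BuildOnboardingConfig with a new voice-host style narrative. Added rules that prohibit revealing instructions, exposing internal state, and using markdown formatting. |
| Microphone integration and markdown removal: web/hooks/useMicrophone.ts, web/lib/stripMarkdown.ts | Added a new useMicrophone hook that captures microphone audio through the Web Audio API (TARGET_SAMPLE_RATE: 16000, BUFFER_SIZE: 4096). The stripMarkdown utility function removes markdown formatting for the voice transcript. |
| Frontend home page integration: web/app/page.tsx | Integrates the useMicrophone hook and starts the microphone after a 500 ms delay following the WebSocket connection, sending PCM data to the server. Normalizes the transcript with stripMarkdown and shows a UI indicator while recording. |
| Added tests: web/__tests__/hooks/useMicrophone.test.ts, web/__tests__/lib/stripMarkdown.test.ts | Added comprehensive tests that verify the useMicrophone hook's start, audio data handling, stop, and safe-stop behavior. Verifies that stripMarkdown removes markdown formatting (bold, italic, headers, links, lists, blockquotes, etc.). |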
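
The two constants quoted in the walkthrough imply a fixed chunk duration and size. The arithmetic can be sketched as below, assuming mono Int16 audio and an AudioContext that really runs at 16 kHz; the helper name `bufferStats` is hypothetical.

```typescript
const TARGET_SAMPLE_RATE = 16000; // Hz, as quoted in the walkthrough
const BUFFER_SIZE = 4096;         // samples per processing callback, as quoted
const BYTES_PER_SAMPLE = 2;       // Int16 PCM, mono (assumption)

// Hypothetical helper: derive per-chunk latency and bandwidth figures.
function bufferStats(sampleRate: number, bufferSize: number) {
  return {
    // Multiply before dividing to keep the result exact for these inputs.
    durationMs: (bufferSize * 1000) / sampleRate,
    bytesPerChunk: bufferSize * BYTES_PER_SAMPLE,
    bytesPerSecond: sampleRate * BYTES_PER_SAMPLE,
  };
}

const stats = bufferStats(TARGET_SAMPLE_RATE, BUFFER_SIZE);
console.log(stats); // { durationMs: 256, bytesPerChunk: 8192, bytesPerSecond: 32000 }
```

Under these assumptions each chunk carries about a quarter of a second of audio, which is the minimum capture latency added before anything reaches the WebSocket.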

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 The microphone sings and the audio stream dances,
Markdown is peeled away and the voice rings clear,
Web and backend walk hand in hand,
Guarded by tests, a new feature is born! 🎙️✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title accurately summarizes the three main parts of the change (microphone input, markdown stripping, system prompt fix) and clearly states the changes that matter most from a developer's perspective. |


gemini-code-assist Bot left a comment

Code Review

This pull request introduces microphone input for voice interaction, a utility to strip markdown from AI responses, and an improved system prompt for a more robust and natural AI. While these features enhance the AI's functionality, a high-severity security issue was identified regarding insecure prompt construction. The application directly embeds potentially untrusted data into system instructions and conversation summaries, creating a significant risk of prompt injection that could allow an attacker to manipulate the AI's behavior or gain unauthorized access. Additionally, the microphone implementation uses a deprecated API, the markdown stripping has a bug with underscores, and WebSocket connection handling could be more reliable. Addressing these security vulnerabilities and functional improvements is critical for stability and safety.

Comment thread web/app/page.tsx
Comment on lines +130 to +135
// Start microphone after a short delay to ensure WebSocket is connected.
setTimeout(() => {
  mic.start((pcm) => {
    send({ type: 'audio', data: pcm });
  });
}, 500);

high

Using setTimeout with a fixed delay to wait for the WebSocket connection is fragile and can lead to race conditions on slower networks. The microphone might start sending data before the connection is established, or the delay might be unnecessarily long.

A more robust approach is to use a useEffect hook to react to the WebSocket connection state. This ensures the microphone is started exactly when the connection becomes available.

Please remove this setTimeout and add the following useEffect to the component:

useEffect(() => {
  if (started && state === 'connected') {
    mic.start((pcm) => {
      send({ type: 'audio', data: pcm });
    });
  }
}, [started, state, mic, send]);

Comment thread web/hooks/useMicrophone.ts
ctxRef.current = ctx;

const source = ctx.createMediaStreamSource(stream);
const processor = ctx.createScriptProcessor(BUFFER_SIZE, 1, 1);

high

createScriptProcessor is deprecated and can cause audio glitches because it runs on the main thread. For robust audio processing, you should migrate to AudioWorklet, which runs in a separate thread, preventing UI freezes or audio dropouts.

You would need to create a worklet file (e.g., audio-processor.js) and then use audioContext.audioWorklet.addModule() and new AudioWorkletNode().

Here's an example of what the worklet and the updated hook would look like:

public/audio-processor.js (this file would need to be created)

class AudioProcessor extends AudioWorkletProcessor {
  process(inputs, outputs, parameters) {
    const pcm = inputs[0][0];
    if (!pcm) return true;

    // Convert Float32 [-1, 1] to Int16 [-32768, 32767]
    const int16 = new Int16Array(pcm.length);
    for (let i = 0; i < pcm.length; i++) {
      const s = Math.max(-1, Math.min(1, pcm[i]));
      int16[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
    }
    this.port.postMessage(int16.buffer, [int16.buffer]);
    return true;
  }
}

registerProcessor('audio-processor', AudioProcessor);

Updated useMicrophone.ts start function:

const start = useCallback(async (onData: (pcm: ArrayBuffer) => void) => {
  if (streamRef.current) return;

  onDataRef.current = onData;

  const stream = await navigator.mediaDevices.getUserMedia({ audio: { echoCancellation: true, noiseSuppression: true, sampleRate: TARGET_SAMPLE_RATE } });
  streamRef.current = stream;

  const ctx = new AudioContext({ sampleRate: TARGET_SAMPLE_RATE });
  ctxRef.current = ctx;

  await ctx.audioWorklet.addModule('/audio-processor.js');
  const workletNode = new AudioWorkletNode(ctx, 'audio-processor');
  processorRef.current = workletNode; // Store for cleanup

  workletNode.port.onmessage = (event) => {
    onDataRef.current?.(event.data);
  };

  const source = ctx.createMediaStreamSource(stream);
  source.connect(workletNode);
  workletNode.connect(ctx.destination);
  setIsRecording(true);
}, []);

Note that processorRef would need to be updated to useRef<AudioWorkletNode | null>(null) and the stop function would need to be adjusted to handle the worklet node.
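
The adjusted stop path mentioned in the note above is not shown in the review. One possible shape is sketched below. The interfaces are deliberately minimal stand-ins (assumptions) for AudioWorkletNode, MediaStream, and AudioContext so the sketch runs outside a browser, and `releaseMic` and `MicResources` are hypothetical names.

```typescript
// Minimal structural stand-ins for the browser objects the hook would hold.
interface NodeLike {
  port: { onmessage: ((event: { data: ArrayBuffer }) => void) | null };
  disconnect(): void;
}
interface StreamLike {
  getTracks(): { stop(): void }[];
}
interface ContextLike {
  close(): Promise<void>;
}
interface MicResources {
  node: NodeLike | null;
  stream: StreamLike | null;
  ctx: ContextLike | null;
}

// Hypothetical cleanup: detach the worklet, stop the tracks, close the context.
// Safe to call when nothing is recording.
function releaseMic(res: MicResources): void {
  if (res.node) {
    res.node.port.onmessage = null; // stop delivering PCM chunks
    res.node.disconnect();
    res.node = null;
  }
  if (res.stream) {
    // Stopping every track is what releases the device and the recording indicator.
    res.stream.getTracks().forEach((track) => track.stop());
    res.stream = null;
  }
  if (res.ctx) {
    // close() is asynchronous; ignore a rejection from an already-closed context.
    res.ctx.close().catch(() => {});
    res.ctx = null;
  }
}
```

In the hook, the three fields would correspond to processorRef, streamRef, and ctxRef, and setIsRecording(false) would follow the cleanup.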

Comment thread web/lib/stripMarkdown.ts
Comment on lines +11 to +16
// Remove bold+italic (***text*** or ___text___)
.replace(/(\*{3}|_{3})(.*?)\1/g, '$2')
// Remove bold (**text** or __text__)
.replace(/(\*{2}|_{2})(.*?)\1/g, '$2')
// Remove italic (*text* or _text_)
.replace(/(\*|_)(.*?)\1/g, '$2')

high

The regular expressions for stripping bold and italic formatting with underscores are too aggressive and can corrupt text that is not markdown. For example, a string like file_name_with_underscores would be incorrectly converted to filenamewith_underscores.

To fix this, the regex for underscores should ensure they are not part of a word (i.e., they are surrounded by whitespace or punctuation, or are at the start/end of the string). This can be achieved using negative lookarounds (?<!\w) and (?!\w).

Suggested change
- // Remove bold+italic (***text*** or ___text___)
- .replace(/(\*{3}|_{3})(.*?)\1/g, '$2')
- // Remove bold (**text** or __text__)
- .replace(/(\*{2}|_{2})(.*?)\1/g, '$2')
- // Remove italic (*text* or _text_)
- .replace(/(\*|_)(.*?)\1/g, '$2')
+ // Remove bold+italic (***text*** or ___text___), handling underscores carefully
+ .replace(/\*{3}(.*?)\*{3}|(?<!\w)___(.*?)___(?!\w)/g, '$1$2')
+ // Remove bold (**text** or __text__)
+ .replace(/\*{2}(.*?)\*{2}|(?<!\w)__(.*?)__(?!\w)/g, '$1$2')
+ // Remove italic (*text* or _text_)
+ .replace(/\*(.*?)\*|(?<!\w)_(.*?)_(?!\w)/g, '$1$2')
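
To see the effect of the suggested lookarounds in isolation, the three replacements can be wrapped in a small function and run against the example from the comment. The wrapper name `stripEmphasis` is hypothetical, and the real stripMarkdown contains further rules not reproduced here.

```typescript
// Hypothetical wrapper around only the three emphasis rules from the suggestion.
function stripEmphasis(text: string): string {
  return text
    // Bold+italic: asterisks anywhere, underscores only at word boundaries.
    .replace(/\*{3}(.*?)\*{3}|(?<!\w)___(.*?)___(?!\w)/g, '$1$2')
    // Bold.
    .replace(/\*{2}(.*?)\*{2}|(?<!\w)__(.*?)__(?!\w)/g, '$1$2')
    // Italic. In each pattern only one alternative matches, so the other
    // capture group is empty and '$1$2' yields just the inner text.
    .replace(/\*(.*?)\*|(?<!\w)_(.*?)_(?!\w)/g, '$1$2');
}

console.log(stripEmphasis('file_name_with_underscores')); // unchanged
console.log(stripEmphasis('**bold** and _italic_'));      // "bold and italic"
```

Because `\w` includes the underscore itself, identifiers with intra-word underscores never satisfy the lookbehind, while emphasis markers at the start of a word or string still do.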

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b053d6432


Comment thread web/app/page.tsx
Comment on lines +131 to +135
setTimeout(() => {
  mic.start((pcm) => {
    send({ type: 'audio', data: pcm });
  });
}, 500);

P1: Cancel delayed microphone start when stopping session

The delayed setTimeout microphone start is never tracked or cleared, so if the user ends the session within the first 500ms, the callback still runs and calls mic.start(...) after disconnect. In that flow the app can re-open microphone capture after the user explicitly pressed “End Session,” which is a user-facing privacy/resource bug and can leave recording active against a stopped session.
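
One way to close the window described above is to keep the timer handle and clear it when the session ends. The sketch below is a framework-independent illustration; the class name `DelayedStart` is hypothetical, and in the page component the handle would more likely live in a useRef.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>;

// Hypothetical helper: a delayed action that can be cancelled before it fires.
class DelayedStart {
  private handle: TimerHandle | null = null;

  // Schedule `fn` once; a second call while one is pending is ignored.
  schedule(fn: () => void, delayMs: number): void {
    if (this.handle !== null) return;
    this.handle = setTimeout(() => {
      this.handle = null;
      fn();
    }, delayMs);
  }

  // Cancel the pending start, if any. Safe to call repeatedly.
  cancel(): void {
    if (this.handle !== null) {
      clearTimeout(this.handle);
      this.handle = null;
    }
  }

  get pending(): boolean {
    return this.handle !== null;
  }
}

// Usage: the "end session" handler cancels before disconnecting.
const micStart = new DelayedStart();
micStart.schedule(() => console.log("mic.start() would run here"), 500);
micStart.cancel(); // user pressed "End Session" within 500 ms: nothing is logged
```

This only addresses the cancellation bug; starting the microphone from the connection state, as the other reviewers suggest, removes the timer altogether.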

coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
web/app/page.tsx (1)

119-124: ⚠️ Potential issue | 🟡 Minor

Microphone cleanup missing on component unmount

If the component unmounts while a session is active, the microphone can be left recording.

🧹 Add mic.stop to the useEffect cleanup
   // Sync body scroll-lock class with session state; cleanup on unmount.
   useEffect(() => {
     document.body.classList.toggle('session-active', started);
     return () => {
       document.body.classList.remove('session-active');
+      mic.stop();
     };
-  }, [started]);
+  }, [started, mic]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/app/page.tsx` around lines 119 - 124, useEffect currently toggles
document.body class on start but its cleanup only removes the class; ensure any
active microphone is stopped on unmount by calling mic.stop() (or the project's
microphone stop function) inside the cleanup when started is true; update the
useEffect cleanup to call mic.stop() (or stopMicrophone()) before removing
'session-active' so the microphone doesn't remain recording; reference
useEffect, started, and mic.stop to locate and implement the change.
🧹 Nitpick comments (1)
web/__tests__/lib/stripMarkdown.test.ts (1)

25-27: Missing test case for image markdown removal

There is no test for image markdown (![alt](url)). Adding one helps catch the link/image ordering bug in stripMarkdown.ts.

🧪 Add an image test case
   it('strips links preserving text', () => {
     expect(stripMarkdown('[click here](https://example.com)')).toBe('click here');
   });

+  it('strips images preserving alt text', () => {
+    expect(stripMarkdown('![logo](https://example.com/logo.png)')).toBe('logo');
+  });
+
   it('strips bullet list markers', () => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/__tests__/lib/stripMarkdown.test.ts` around lines 25 - 27, Add a unit
test to web/__tests__/lib/stripMarkdown.test.ts that verifies stripMarkdown
correctly handles image markdown; specifically, add an it block (e.g., "strips
image markdown preserving alt text") that calls
stripMarkdown('![alt](https://example.com)') and expects the result toBe('alt'),
referencing the stripMarkdown function to catch the link/image ordering bug.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e57055 and 8b053d6.

📒 Files selected for processing (8)
  • internal/handler/websocket.go
  • internal/session/manager.go
  • internal/session/manager_test.go
  • web/__tests__/hooks/useMicrophone.test.ts
  • web/__tests__/lib/stripMarkdown.test.ts
  • web/app/page.tsx
  • web/hooks/useMicrophone.ts
  • web/lib/stripMarkdown.ts

Comment thread web/app/page.tsx
Comment on lines +130 to 136
// Start microphone after a short delay to ensure WebSocket is connected.
setTimeout(() => {
  mic.start((pcm) => {
    send({ type: 'audio', data: pcm });
  });
}, 500);
};

⚠️ Potential issue | 🟠 Major

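Race condition: microphone starts before the WebSocket connects

A fixed 500 ms delay does not guarantee that the WebSocket is connected. Depending on network conditions the connection can take longer, and the microphone starts even if the connection fails.

🔧 Start the microphone based on WebSocket state instead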
+  // Start mic when WebSocket connects
+  useEffect(() => {
+    if (started && state === 'connected' && !mic.isRecording) {
+      mic.start((pcm) => {
+        send({ type: 'audio', data: pcm });
+      });
+    }
+  }, [started, state, mic, send]);
+
   const handleStart = () => {
     initAudioContext();
     connect();
     setStarted(true);
-    // Start microphone after a short delay to ensure WebSocket is connected.
-    setTimeout(() => {
-      mic.start((pcm) => {
-        send({ type: 'audio', data: pcm });
-      });
-    }, 500);
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/app/page.tsx` around lines 130 - 136, The fixed 500ms setTimeout can
start the mic before the WebSocket is ready; instead, wait for the WebSocket
open/connected event or a connection promise before calling mic.start, and only
call send({ type: 'audio', data: pcm }) after confirming the socket is open; on
connection failure or close do not start or immediately stop the mic. Locate the
mic.start(...) call and the send(...) usage and replace the timeout-based start
with logic that listens for the WebSocket instance's 'open' (or a connect
promise) and starts mic.start inside that handler, and ensure cleanup/stop on
socket error/close.
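
The prompt's second requirement, sending audio only once the socket is confirmed open, can be sketched as a small guard. This is an illustration under assumptions: `SocketLike` is a minimal stand-in for the browser WebSocket, `sendIfOpen` is a hypothetical name, and the PR's own `send` wrapper is not shown on this page.

```typescript
// Numeric value of WebSocket.OPEN as defined by the WebSocket API.
const WS_OPEN = 1;

// Minimal stand-in for the parts of WebSocket this sketch needs.
interface SocketLike {
  readyState: number;
  send(data: ArrayBuffer): void;
}

// Hypothetical guard: forward a PCM chunk only when the socket is open.
// Returns false when the chunk was dropped.
function sendIfOpen(socket: SocketLike | null, pcm: ArrayBuffer): boolean {
  if (!socket || socket.readyState !== WS_OPEN) {
    return false; // still connecting, closing, or closed: drop the chunk
  }
  socket.send(pcm);
  return true;
}
```

Dropping chunks rather than queueing them keeps stale audio from being replayed once the connection comes up; whether that trade-off suits this app is a design choice the PR does not settle.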

Comment thread web/hooks/useMicrophone.ts
Comment on lines +21 to +28
const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    echoCancellation: true,
    noiseSuppression: true,
    sampleRate: TARGET_SAMPLE_RATE,
  },
});
streamRef.current = stream;

⚠️ Potential issue | 🟡 Minor

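No error handling for getUserMedia

If getUserMedia fails (permission denied, no microphone, etc.), the promise rejects and an unhandled exception propagates to the caller. The call site (page.tsx) does not catch it either, so the user gets no meaningful feedback.

🛡️ Recommended: add try-catch or an error state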
+  const [error, setError] = useState<string | null>(null);
+
   const start = useCallback(async (onData: (pcm: ArrayBuffer) => void) => {
     if (streamRef.current) return; // already recording

     onDataRef.current = onData;

-    const stream = await navigator.mediaDevices.getUserMedia({
-      audio: {
-        echoCancellation: true,
-        noiseSuppression: true,
-        sampleRate: TARGET_SAMPLE_RATE,
-      },
-    });
+    let stream: MediaStream;
+    try {
+      stream = await navigator.mediaDevices.getUserMedia({
+        audio: {
+          echoCancellation: true,
+          noiseSuppression: true,
+          sampleRate: TARGET_SAMPLE_RATE,
+        },
+      });
+    } catch (err) {
+      setError(err instanceof Error ? err.message : 'Microphone access denied');
+      return;
+    }
+    setError(null);
     streamRef.current = stream;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/hooks/useMicrophone.ts` around lines 21 - 28, Wrap the
navigator.mediaDevices.getUserMedia call in a try-catch inside useMicrophone
(the hook containing streamRef) so rejections (permission denied, no device) are
handled; on success set streamRef.current = stream as before, on failure
set/return an explicit error state (e.g., a returned error value or hook state
like microphoneError) and avoid leaving streamRef undefined, and ensure the
hook's public API surfaces that error so callers (like page.tsx) can display
feedback instead of encountering an unhandled rejection.
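
If the hook surfaces an error state as suggested, the raw exception message is often not user-friendly. A possible mapping from the exception names that getUserMedia rejects with to display text is sketched below; `describeMicError` and the message strings are hypothetical and not part of the PR.

```typescript
// Hypothetical helper: turn a getUserMedia rejection into user-facing text.
// The exception names are the ones getUserMedia is specified to reject with.
function describeMicError(err: unknown): string {
  // Read `name` structurally so this works for DOMException and plain Errors alike.
  const name =
    typeof err === "object" && err !== null && "name" in err
      ? String((err as { name: unknown }).name)
      : "";
  switch (name) {
    case "NotAllowedError":
      return "Microphone permission was denied.";
    case "NotFoundError":
      return "No microphone was found.";
    case "NotReadableError":
      return "The microphone is in use or unavailable.";
    default:
      return "Could not start the microphone.";
  }
}
```

In the catch block from the diff above, `setError(describeMicError(err))` would replace the raw `err.message`.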

Comment thread web/lib/stripMarkdown.ts
Comment on lines +19 to +22
// Remove links [text](url) → text
.replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')
// Remove images ![alt](url)
.replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1')

⚠️ Potential issue | 🟡 Minor

Image and link removal order

In the current order the link regex runs before the image regex, so for ![alt](url) only the [alt](url) part matches and the ! is left behind.

🐛 Place the image regex before the link regex
     // Remove headers (# ... ##)
     .replace(/^#{1,6}\s+/gm, '')
-    // Remove links [text](url) → text
-    .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')
     // Remove images ![alt](url)
     .replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1')
+    // Remove links [text](url) → text
+    .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')
     // Remove bullet list markers
Suggested change
- // Remove links [text](url) → text
- .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')
- // Remove images ![alt](url)
- .replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1')
+ // Remove headers (# ... ##)
+ .replace(/^#{1,6}\s+/gm, '')
+ // Remove images ![alt](url)
+ .replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1')
+ // Remove links [text](url) → text
+ .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')
+ // Remove bullet list markers
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@web/lib/stripMarkdown.ts` around lines 19 - 22, The current .replace chain in
stripMarkdown leaves a stray "!" because the link regex
(.replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')) runs before the image regex; change
the order so the image removal regex (.replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1'))
executes first, then the link regex, ensuring `![alt](url)` is fully handled;
update the .replace sequence inside the stripMarkdown function accordingly.
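
The ordering problem is easy to reproduce with just the two regexes quoted in this thread. The function names below are hypothetical, and each function contains only these two rules rather than the full stripMarkdown chain.

```typescript
// Order as merged in this PR: links first, then images.
function linksThenImages(text: string): string {
  return text
    .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1')
    .replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1');
}

// Order proposed by the review: images first, then links.
function imagesThenLinks(text: string): string {
  return text
    .replace(/!\[([^\]]*)\]\([^)]+\)/g, '$1')
    .replace(/\[([^\]]+)\]\([^)]+\)/g, '$1');
}

const image = '![logo](https://example.com/logo.png)';
console.log(linksThenImages(image)); // "!logo" (the link rule consumed the brackets)
console.log(imagesThenLinks(image)); // "logo"
```

Plain links behave identically in both orders, which is why the existing link test did not catch the issue.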

@ComBba ComBba merged commit 4691e64 into main Feb 26, 2026
8 of 9 checks passed
@ComBba ComBba deleted the feat/mic-input-and-transcript-fix branch February 26, 2026 01:37