feat(responses): route /v1/responses through non-OpenAI adaptors via chat completions conversion #2970
Conversation
Walkthrough

Adds OpenAI Responses API support and streaming conversion between Claude and OpenAI-compatible Responses: updates DTOs, introduces a chat->responses streaming state machine, adds bidirectional conversion functions, and integrates the conversion into the Claude relay adaptor and streaming handlers.
Sequence Diagram

```mermaid
sequenceDiagram
    actor Client
    participant RelayAdaptor as Relay Adaptor
    participant ConversionSvc as Conversion Service
    participant Claude as Claude Channel
    participant RelayStream as Relay Stream Handler
    participant StateMgr as Responses Stream State
    Client->>RelayAdaptor: OpenAI Responses request
    RelayAdaptor->>ConversionSvc: ResponsesRequestToChatCompletionsRequest
    ConversionSvc-->>RelayAdaptor: Chat Completions request
    RelayAdaptor->>Claude: Forward chat request
    Claude->>RelayStream: Streamed chat-completions chunks
    RelayStream->>StateMgr: HandleChatChunk(chunk)
    StateMgr-->>RelayStream: []ResponsesStreamResponse events
    RelayStream-->>Client: Stream response chunks
    Note over Claude,RelayStream: Repeat for streaming chunks
    Claude->>RelayStream: Final chat response
    RelayStream->>ConversionSvc: ChatCompletionsResponseToResponsesResponse
    ConversionSvc-->>RelayStream: Final Responses payload
    RelayStream-->>Client: Final response.completed event
```
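Stripped of transport and streaming detail, the round trip in the diagram is two pure conversions wrapped around a chat-completions call. A minimal sketch of that shape (all type and function names below are hypothetical placeholders, not the actual new-api dto structs):

```go
package main

import "fmt"

// Placeholder types standing in for the real dto structs.
type responsesRequest struct{ Input string }
type chatRequest struct{ Prompt string }
type chatResponse struct{ Content string }
type responsesResponse struct{ OutputText string }

// responsesToChat mirrors the role of ResponsesRequestToChatCompletionsRequest.
func responsesToChat(r responsesRequest) chatRequest {
	return chatRequest{Prompt: r.Input}
}

// chatToResponses mirrors the role of ChatCompletionsResponseToResponsesResponse.
func chatToResponses(c chatResponse) responsesResponse {
	return responsesResponse{OutputText: c.Content}
}

// relay is the convert -> forward -> convert-back pipeline from the diagram:
// the upstream callback stands in for the existing adaptor/channel call.
func relay(req responsesRequest, upstream func(chatRequest) chatResponse) responsesResponse {
	return chatToResponses(upstream(responsesToChat(req)))
}

func main() {
	resp := relay(responsesRequest{Input: "hi"}, func(c chatRequest) chatResponse {
		return chatResponse{Content: "echo: " + c.Prompt}
	})
	fmt.Println(resp.OutputText)
}
```

The design point of the PR is exactly this factoring: because the two outer conversions are adaptor-agnostic, any channel that already speaks chat completions gets `/v1/responses` support for free.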
Actionable comments posted: 2
🧹 Nitpick comments (1)
service/openaicompat/chat_stream_to_responses_stream.go (1)
Lines 83-96: Reasoning summary events use hardcoded indices without a corresponding output item.

The `output_index: 0` and `summary_index: 0` are hardcoded, and no `response.output_item.added` event is emitted for a reasoning item. If a text message is already at output index 0, clients may find the overlap confusing. This is acceptable as a first iteration for the compatibility layer, but consider tracking reasoning as a proper output item in a follow-up if the OpenAI Responses API spec requires it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@service/openaicompat/chat_stream_to_responses_stream.go` around lines 83 - 96, The reasoning summary currently uses hardcoded OutputIndex/SummaryIndex (0) in the dto.ResponsesStreamResponse built from reasoningContent; instead emit a corresponding response.output_item.added event for the reasoning item before appending the "response.reasoning_summary_text.delta" event, compute a proper outputIndex (e.g. nextOutputIndex based on current outputs or a counter) and use that index for both OutputIndex and SummaryIndex when constructing the dto.ResponsesStreamResponse, and ensure the ItemID (currently "rs_"+strings.TrimPrefix(s.ResponseID,"resp_")) and s.ResponseID are consistent with the added output item so clients won't see an index collision.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@relay/channel/claude/relay-claude.go`:
- Line 856: Replace the direct use of encoding/json by calling the project's
JSON wrapper: change the call that marshals responsesResp into responseData from
json.Marshal(responsesResp) to common.Marshal(responsesResp), and update imports
to remove encoding/json and import the common package (the wrapper defined in
common/json.go); keep existing error handling around the returned err and ensure
the variable names responseData and responsesResp remain unchanged.
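The swap the prompt asks for is mechanical once such a wrapper exists. A hypothetical sketch of a `Marshal` wrapper of the kind `common/json.go` provides (the real one in new-api may differ, e.g. by delegating to a faster JSON library):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Marshal is a stand-in for the project's common.Marshal: a single choke
// point so the JSON implementation can be swapped project-wide without
// touching every call site.
func Marshal(v any) ([]byte, error) {
	return json.Marshal(v)
}

func main() {
	responsesResp := map[string]string{"id": "resp_123"}
	responseData, err := Marshal(responsesResp)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(responseData))
}
```

With the wrapper in place, the fix is just replacing `json.Marshal(responsesResp)` with the wrapper call and dropping the `encoding/json` import from `relay-claude.go`.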
In `@service/openaicompat/responses_to_chat.go`:
- Around line 367-381: In the switch handling media parts in
responses_to_chat.go, change the incorrect key lookups so media isn't dropped:
in the "input_file" case populate dto.MediaContent.File/FileData/FileID from
partMap["file_id"], partMap["file_url"] or partMap["file_data"] (pick available
field) and set Type dto.ContentTypeFile; for the non-standard "input_video" case
either remove it or map it defensively by checking for partMap["file_url"] or
partMap["file_id"] and only create dto.MediaContent with Type
dto.ContentTypeVideoUrl when a valid URL/ID is present; ensure you reference the
same variables used there (partMap, chatParts, dto.MediaContent,
dto.ContentTypeFile, dto.ContentTypeVideoUrl) and skip/return no media if none
of the expected keys exist.
---
Nitpick comments:
In `@service/openaicompat/chat_stream_to_responses_stream.go`:
- Around line 83-96: The reasoning summary currently uses hardcoded
OutputIndex/SummaryIndex (0) in the dto.ResponsesStreamResponse built from
reasoningContent; instead emit a corresponding response.output_item.added event
for the reasoning item before appending the
"response.reasoning_summary_text.delta" event, compute a proper outputIndex
(e.g. nextOutputIndex based on current outputs or a counter) and use that index
for both OutputIndex and SummaryIndex when constructing the
dto.ResponsesStreamResponse, and ensure the ItemID (currently
"rs_"+strings.TrimPrefix(s.ResponseID,"resp_")) and s.ResponseID are consistent
with the added output item so clients won't see an index collision.
```go
case "input_audio":
	chatParts = append(chatParts, dto.MediaContent{
		Type:       dto.ContentTypeInputAudio,
		InputAudio: partMap["input_audio"],
	})
case "input_file":
	chatParts = append(chatParts, dto.MediaContent{
		Type: dto.ContentTypeFile,
		File: partMap["file"],
	})
case "input_video":
	chatParts = append(chatParts, dto.MediaContent{
		Type:     dto.ContentTypeVideoUrl,
		VideoUrl: partMap["video_url"],
	})
```
🧩 Analysis chain
🏁 Script executed:

```shell
cat -n service/openaicompat/responses_to_chat.go | sed -n '360,390p'
```

Repository: QuantumNous/new-api
Length of output: 1052

🏁 Script executed:

```shell
cat -n service/openaicompat/responses_to_chat.go | sed -n '330,400p'
```

Repository: QuantumNous/new-api
Length of output: 2532

🏁 Script executed:

```shell
head -50 service/openaicompat/responses_to_chat.go
```

Repository: QuantumNous/new-api
Length of output: 1574
🌐 Web query:
OpenAI Responses API input_audio input_file content part schema specification
💡 Result:
input_audio content part (Chat Completions API)

In a `messages[].content[]` array, an audio input is:

```json
{
  "type": "input_audio",
  "input_audio": {
    "data": "<base64-encoded-audio-bytes>",
    "format": "wav"
  }
}
```

- `type`: must be `"input_audio"`
- `input_audio.data`: base64-encoded audio
- `input_audio.format`: `"wav"` or `"mp3"` [1]

input_file content part (Responses API)

In `input: [{ "role": "...", "content": [ ... ] }]`, a file input item is:

```json
{
  "type": "input_file",
  "file_id": "file_..."
}
```

- `type`: must be `"input_file"`
- Provide one of: `file_id` (previously uploaded file), or `file_url`, or `filename` + `file_data` (base64) [2][3]

Note: In Chat Completions, the analogous file content part is `{"type":"file","file":{...}}` (not `input_file`). [1]

Sources: [1] Chat Completions create schema (content parts, incl. input_audio, file) • [2] Responses create schema (ResponseInputFile) • [3] File inputs guide (examples using type: "input_file")
input_file and input_video content mappings read non-existent keys, silently dropping media content.
The Responses API `input_file` type uses `file_id`, `file_url`, or `file_data` (not `"file"`), and `input_video` is not a standard Responses API content type. Line 375 reads `partMap["file"]`, which returns nil, and line 380 reads `partMap["video_url"]` for a non-standard type; both result in incomplete media content conversion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@service/openaicompat/responses_to_chat.go` around lines 367 - 381, In the
switch handling media parts in responses_to_chat.go, change the incorrect key
lookups so media isn't dropped: in the "input_file" case populate
dto.MediaContent.File/FileData/FileID from partMap["file_id"],
partMap["file_url"] or partMap["file_data"] (pick available field) and set Type
dto.ContentTypeFile; for the non-standard "input_video" case either remove it or
map it defensively by checking for partMap["file_url"] or partMap["file_id"] and
only create dto.MediaContent with Type dto.ContentTypeVideoUrl when a valid
URL/ID is present; ensure you reference the same variables used there (partMap,
chatParts, dto.MediaContent, dto.ContentTypeFile, dto.ContentTypeVideoUrl) and
skip/return no media if none of the expected keys exist.
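The key-lookup fix the prompt describes can be sketched outside the project. The DTO shape below is a simplified stand-in for `dto.MediaContent` (the real struct's fields may differ); the point is picking whichever of `file_id`, `file_url`, or `file_data` is actually present and skipping the part when none is:

```go
package main

import "fmt"

// filePart is a simplified stand-in for dto.MediaContent's file fields.
type filePart struct {
	Type     string
	FileID   string
	FileURL  string
	FileData string
}

// convertInputFile reads the keys the Responses API actually uses
// (file_id, file_url, file_data) instead of the non-existent "file" key,
// and reports false when no usable key is present so the caller can skip
// the part rather than emit an empty stub.
func convertInputFile(partMap map[string]any) (filePart, bool) {
	p := filePart{Type: "file"}
	if v, ok := partMap["file_id"].(string); ok && v != "" {
		p.FileID = v
		return p, true
	}
	if v, ok := partMap["file_url"].(string); ok && v != "" {
		p.FileURL = v
		return p, true
	}
	if v, ok := partMap["file_data"].(string); ok && v != "" {
		p.FileData = v
		return p, true
	}
	return filePart{}, false
}

func main() {
	p, ok := convertInputFile(map[string]any{"file_id": "file_123"})
	fmt.Println(ok, p.FileID)
}
```

The same ok-flag pattern covers the non-standard `input_video` case: only build the part when a valid URL or ID is present, otherwise drop it silently instead of forwarding `{"type":"input_video"}` with no payload.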
…chat completions conversion

Enables channels that don't natively support the OpenAI Responses API (e.g. Claude/Anthropic) to serve /v1/responses requests by converting them to /v1/chat/completions format, routing through the existing adaptor pipeline, and converting the response back.

Changes:
- Implement ConvertOpenAIResponsesRequest in Claude adaptor using Responses→Chat→Claude conversion chain (reuses existing code)
- Add ChatToResponsesStreamState for converting chat completion stream chunks into Responses API SSE events (handles text deltas, tool calls, reasoning content, usage)
- Add RelayFormatOpenAIResponses support to Claude stream/non-stream handlers in relay-claude.go
- Add ResponsesRequestToChatCompletionsRequest converter with:
  - GetJsonType() for robust input type detection
  - developer role → system mapping
  - PromptCacheKey/PromptCacheRetention passthrough
  - Non-function tool passthrough via Custom field
  - function_call/function_call_output → tool message merging
- Add ChatCompletionsResponseToResponsesResponse for non-streaming
- Fix: tool calls no longer dropped when text content exists in ResponsesResponseToChatCompletionsResponse
- Update ResponsesStreamResponse DTO with ResponseID, Text, Arguments, ContentIndex, Part fields

Combines and improves on approaches from:
- QuantumNous#2817 (adaptor-level conversion, converter improvements)
- QuantumNous#2892 (tool calls + text coexistence bug fix)

Related: QuantumNous#2941, QuantumNous#2043, QuantumNous#1527
Force-pushed 924c334 to 5c69030 (compare)
Actionable comments posted: 2
🧹 Nitpick comments (1)
service/openaicompat/responses_to_chat.go (1)
Lines 383-394: `default` case silently produces incomplete content parts for unknown types.

For any unrecognised Responses API content part type, the current code appends a `MediaContent{Type: partType}` with no payload (no `Text`, no `ImageUrl`, no `InputAudio`, no `File`). When serialised this becomes `{"type":"<unknown>"}`, which will either confuse downstream models or be silently ignored; either way the original content is lost without any indication.

Consider skipping unknown parts instead of emitting an empty stub:

♻️ Proposed change

```diff
 default:
-	chatParts = append(chatParts, dto.MediaContent{
-		Type: partType,
-	})
+	// Unknown Responses content type — skip rather than emit an empty stub
+	common.SysLog(fmt.Sprintf("convertResponsesContentToChat: skipping unknown part type %q", partType))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@service/openaicompat/responses_to_chat.go` around lines 383 - 394, The default switch branch currently appends an empty dto.MediaContent{Type: partType} which yields {"type":"<unknown>"} and loses content; change the default to skip unknown part types (do not append to chatParts) and optionally log a warning; update the switch in the function that builds chatParts (the code that references chatParts, dto.MediaContent and partType) to only append when a known payload is set and simply continue/ignore on unknown types instead of emitting an empty stub.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@service/openaicompat/chat_stream_to_responses_stream.go`:
- Around line 83-96: The reasoning delta block uses a hardcoded outIndex = 0
which conflicts with the stream's output-index allocator; replace the literal
with a proper allocated index (use the same allocator/field used elsewhere, e.g.
call s.allocOutputIndex() or increment s.NextOutputIndex to produce a unique
index) and set that allocated value as OutputIndex for the reasoning event (keep
SummaryIndex separate). Also add a dedicated reasoning output index field to the
ChatToResponsesStreamState struct (initialise it to -1 in
NewChatToResponsesStreamState) so the reasoning index is tracked consistently
and not duplicated by ensureMessageItemEvents or other allocators.
- Around line 158-222: FinalEvents currently finalizes message and tool-call
events but never emits the paired reasoning done events; add a reasoning
finalization block in FinalEvents (similar to the "Finalize message item" block)
that checks the state fields used while streaming reasoning deltas (e.g.,
s.ReasoningText / s.ReasoningPartAdded / s.ReasoningContentAdded or whatever
names are present in the struct) and appends the corresponding
dto.ResponsesStreamResponse entries for "response.reasoning_summary_text.done"
and/or "response.reasoning_summary_part.done" (or call existing helper methods
like s.reasoningTextDoneEvent/s.reasoningPartDoneEvent if present) before
appending the message/tool completions and the final response so clients receive
matching done events for each reasoning delta.
---
Nitpick comments:
In `@service/openaicompat/responses_to_chat.go`:
- Around line 383-394: The default switch branch currently appends an empty
dto.MediaContent{Type: partType} which yields {"type":"<unknown>"} and loses
content; change the default to skip unknown part types (do not append to
chatParts) and optionally log a warning; update the switch in the function that
builds chatParts (the code that references chatParts, dto.MediaContent and
partType) to only append when a known payload is set and simply continue/ignore
on unknown types instead of emitting an empty stub.
```go
// Reasoning content (for models that emit reasoning_content)
reasoningContent := delta.GetReasoningContent()
if reasoningContent != "" {
	outIndex := 0
	summaryIndex := 0
	events = append(events, dto.ResponsesStreamResponse{
		Type:         "response.reasoning_summary_text.delta",
		ResponseID:   s.ResponseID,
		ItemID:       "rs_" + strings.TrimPrefix(s.ResponseID, "resp_"),
		OutputIndex:  &outIndex,
		SummaryIndex: &summaryIndex,
		Delta:        reasoningContent,
	})
}
```
Hardcoded `outIndex = 0` conflicts with the output-index allocator for reasoning content.

When a response contains both reasoning content and text, `outIndex` for reasoning is always 0 as a local literal, never allocated via `allocOutputIndex`. If `ensureMessageItemEvents` fires later (also allocating `MessageOutputIndex = s.NextOutputIndex`, which starts at 0), both the reasoning events and the message item events share `OutputIndex` 0, producing conflicting SSE events for clients that parse `OutputIndex` strictly.

The fix is to allocate a dedicated index for the reasoning item through the same `NextOutputIndex` counter:
💡 Proposed fix – allocate a reasoning output index

```diff
+	// Reasoning content
 	reasoningContent := delta.GetReasoningContent()
 	if reasoningContent != "" {
-		outIndex := 0
+		if s.ReasoningOutputIndex < 0 {
+			s.ReasoningOutputIndex = s.NextOutputIndex
+			s.NextOutputIndex++
+		}
+		outIndex := s.ReasoningOutputIndex
 		summaryIndex := 0
 		events = append(events, dto.ResponsesStreamResponse{
```

And add to the struct:

```diff
+	ReasoningOutputIndex int // -1 until allocated
```

initialised to -1 in `NewChatToResponsesStreamState`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@service/openaicompat/chat_stream_to_responses_stream.go` around lines 83 -
96, The reasoning delta block uses a hardcoded outIndex = 0 which conflicts with
the stream's output-index allocator; replace the literal with a proper allocated
index (use the same allocator/field used elsewhere, e.g. call
s.allocOutputIndex() or increment s.NextOutputIndex to produce a unique index)
and set that allocated value as OutputIndex for the reasoning event (keep
SummaryIndex separate). Also add a dedicated reasoning output index field to the
ChatToResponsesStreamState struct (initialise it to -1 in
NewChatToResponsesStreamState) so the reasoning index is tracked consistently
and not duplicated by ensureMessageItemEvents or other allocators.
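The allocator the review proposes is small enough to show in isolation. A minimal sketch of the bookkeeping, assuming field names like `NextOutputIndex` and `ReasoningOutputIndex` from the review text (the real `ChatToResponsesStreamState` has many more fields):

```go
package main

import "fmt"

// streamState is a hypothetical slice of ChatToResponsesStreamState,
// showing only the output-index bookkeeping discussed in the review.
type streamState struct {
	NextOutputIndex      int
	ReasoningOutputIndex int // -1 until the reasoning item is allocated
}

func newStreamState() *streamState {
	return &streamState{ReasoningOutputIndex: -1}
}

// allocOutputIndex hands out the next unique output index.
func (s *streamState) allocOutputIndex() int {
	i := s.NextOutputIndex
	s.NextOutputIndex++
	return i
}

// reasoningIndex allocates the reasoning item's index once, then reuses
// it for every subsequent reasoning delta, so it can never collide with
// the message item's index.
func (s *streamState) reasoningIndex() int {
	if s.ReasoningOutputIndex < 0 {
		s.ReasoningOutputIndex = s.allocOutputIndex()
	}
	return s.ReasoningOutputIndex
}

func main() {
	s := newStreamState()
	fmt.Println(s.reasoningIndex())   // reasoning item gets index 0
	fmt.Println(s.allocOutputIndex()) // message item gets a distinct index
	fmt.Println(s.reasoningIndex())   // reasoning index is stable
}
```

Routing every item through one counter is the invariant that matters: whichever item type appears first claims index 0, and no two items can ever share an index.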
```go
// FinalEvents emits the closing events: content done, tool calls done, and
// response.completed.
func (s *ChatToResponsesStreamState) FinalEvents(usage *dto.Usage) []dto.ResponsesStreamResponse {
	events := s.baseEvents()

	// Finalize message item
	if s.MessageItemAdded {
		text := s.OutputText.String()
		if s.MessageContentAdded {
			events = append(events, s.outputTextDoneEvent(text))
			events = append(events, s.contentPartDoneEvent(text))
		}
		events = append(events, s.messageItemDoneEvent(text))
	}

	// Finalize tool calls
	for _, callID := range s.ToolCallOrder {
		outIndex := s.outputIndexPtr(callID)
		args := s.ToolCallArgs[callID]
		if args != "" {
			events = append(events, dto.ResponsesStreamResponse{
				Type:        "response.function_call_arguments.done",
				ResponseID:  s.ResponseID,
				ItemID:      callID,
				OutputIndex: outIndex,
				Arguments:   args,
			})
		}
		events = append(events, dto.ResponsesStreamResponse{
			Type:        "response.output_item.done",
			ResponseID:  s.ResponseID,
			ItemID:      callID,
			OutputIndex: outIndex,
			Item: &dto.ResponsesOutput{
				Type:      "function_call",
				ID:        callID,
				Status:    "completed",
				CallId:    callID,
				Name:      s.ToolCallName[callID],
				Arguments: args,
			},
		})
	}

	// Build final output and usage
	output := s.buildFinalOutput()
	finalUsage := s.buildFinalUsage(usage)

	resp := &dto.OpenAIResponsesResponse{
		ID:        s.ResponseID,
		Object:    "response",
		CreatedAt: int(s.CreatedAt),
		Status:    "completed",
		Model:     s.Model,
		Output:    output,
		Usage:     finalUsage,
	}
	events = append(events, dto.ResponsesStreamResponse{
		Type:       "response.completed",
		ResponseID: s.ResponseID,
		Response:   resp,
	})

	return events
}
```
`FinalEvents` emits no done event for reasoning content.

When reasoning deltas were emitted during streaming, `FinalEvents` finalises text and tool calls but never emits `response.reasoning_summary_text.done` (or `response.reasoning_summary_part.done`). Clients that wait for a paired done event for every delta will be left with an incomplete SSE sequence.
💡 Suggested addition in FinalEvents

```diff
+	// Finalize reasoning content
+	if s.ReasoningOutputIndex >= 0 {
+		summaryIndex := 0
+		events = append(events, dto.ResponsesStreamResponse{
+			Type:         "response.reasoning_summary_text.done",
+			ResponseID:   s.ResponseID,
+			ItemID:       "rs_" + strings.TrimPrefix(s.ResponseID, "resp_"),
+			OutputIndex:  func() *int { i := s.ReasoningOutputIndex; return &i }(),
+			SummaryIndex: &summaryIndex,
+			Text:         s.ReasoningText.String(), // accumulate in HandleChatChunk
+		})
+	}
```
Verify each finding against the current code and only fix it if needed.
In `@service/openaicompat/chat_stream_to_responses_stream.go` around lines 158 -
222, FinalEvents currently finalizes message and tool-call events but never
emits the paired reasoning done events; add a reasoning finalization block in
FinalEvents (similar to the "Finalize message item" block) that checks the state
fields used while streaming reasoning deltas (e.g., s.ReasoningText /
s.ReasoningPartAdded / s.ReasoningContentAdded or whatever names are present in
the struct) and appends the corresponding dto.ResponsesStreamResponse entries
for "response.reasoning_summary_text.done" and/or
"response.reasoning_summary_part.done" (or call existing helper methods like
s.reasoningTextDoneEvent/s.reasoningPartDoneEvent if present) before appending
the message/tool completions and the final response so clients receive matching
done events for each reasoning delta.
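The invariant this finding asks for can be stated simply: every `*.delta` event type emitted during streaming must be matched by a `*.done` before `response.completed`. A sketch of a checker for that invariant (a hypothetical test helper, not part of the PR):

```go
package main

import (
	"fmt"
	"strings"
)

// unmatchedDeltas returns the done event types that are missing for delta
// event types that were emitted, i.e. the gap the review flags for
// reasoning summaries in FinalEvents.
func unmatchedDeltas(eventTypes []string) []string {
	open := map[string]bool{}
	for _, t := range eventTypes {
		switch {
		case strings.HasSuffix(t, ".delta"):
			open[strings.TrimSuffix(t, ".delta")] = true
		case strings.HasSuffix(t, ".done"):
			delete(open, strings.TrimSuffix(t, ".done"))
		}
	}
	var missing []string
	for base := range open {
		missing = append(missing, base+".done")
	}
	return missing
}

func main() {
	events := []string{
		"response.output_text.delta",
		"response.reasoning_summary_text.delta",
		"response.output_text.done",
		"response.completed",
	}
	// Reports the missing response.reasoning_summary_text.done
	fmt.Println(unmatchedDeltas(events))
}
```

A helper like this makes the fix verifiable in a unit test: feed the full event stream produced by `HandleChatChunk` plus `FinalEvents` through it and assert the result is empty.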
Summary
Routes `/v1/responses` requests through non-OpenAI adaptors (starting with Claude) by converting Responses → Chat Completions → Native format, then converting responses back. Reuses the existing adaptor pipeline instead of duplicating conversion logic per adaptor.

Why not use existing PRs directly?

vs #2892 (494 lines, Claude-only)

- `RequestOpenAIResponses2ClaudeMessage` reimplements all the logic already in `RequestOpenAI2ClaudeMessage` (max tokens, thinking, tools, instructions, tool choice)
- `DoResponsesRequest` does naive SSE passthrough (`scanner.Scan` + `fmt.Fprintf`): it pipes raw Claude `content_block_delta` events without converting to Responses API format (`response.output_text.delta`, `response.created`, etc.). Clients expecting Responses SSE won't work
- `ResponseClaude2OpenAIResponses` duplicates non-streaming conversion already handled by `ClaudeHandler` + `ResponseClaude2OpenAI`

vs #2817 (1039 lines)

- `ChatToResponsesStreamState` and converter
- `SummaryIndex` field
- `ConvertOpenAIResponsesRequest` pattern, `GetJsonType()` for input detection, `developer` → `system` role mapping, `PromptCacheKey`/`PromptCacheRetention` passthrough, non-function tool passthrough via `Custom` field

Our approach (1006 lines, adaptor-agnostic)

- `ResponsesRequestToChatCompletionsRequest` → `ConvertOpenAIRequest` (reuses all existing Claude conversion logic)
- `ChatToResponsesStreamState`: reusable state machine any adaptor can use; properly emits `response.created`, `response.output_text.delta`, `response.function_call_arguments.delta`, `response.completed`, etc.
- `ResponsesRequestToChatCompletionsRequest`: generic converter with function_call/function_call_output merging, content type handling, response format conversion; works for any adaptor, not just Claude

Related issues

- `/v1/responses` → `/v1/chat/completions` conversion for unsupported channels (this implements it)
- `/v1/responses`

Files changed

- `relay/channel/claude/adaptor.go`: `ConvertOpenAIResponsesRequest` (+5 lines)
- `relay/channel/claude/relay-claude.go`: `RelayFormatOpenAIResponses` to stream/non-stream handlers
- `service/openaicompat/responses_to_chat.go`
- `service/openaicompat/chat_stream_to_responses_stream.go`
- `service/openai_chat_responses_compat.go`
- `dto/openai_response.go`: `ResponsesStreamResponse`

Test plan

- `/v1/responses` through Claude channel: verify SSE events
- `/v1/responses` through Claude channel