[Don't merge] smoke tests flake bust#323242
Draft
pwang347 wants to merge 21 commits into
Draft
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
Adds new smoke coverage around chat model configuration (reasoning effort + context size) and updates the chat model picker/mocking infrastructure to support asserting forwarded request payloads. The changes span smoke tests + automation drivers, plus workbench chat model picker behavior and the chat-simulation mock server.
Changes:
- Add new smoke suites that exercise model configuration in both the Chat panel and Agents Window (Local session), including request-body assertions via a request-capturing mock server.
- Extend smoke harness utilities with a pre-start hook and a best-effort storage pre-seed to avoid first-send flakiness.
- Update the workbench chat model picker (and its tests) plus enhance the mock LLM server model fixtures and request capture support.
Show a summary per file
| File | Description |
|---|---|
| test/smoke/src/utils.ts | Adds app beforeStart hook support and a best-effort storage DB pre-seed helper for chat extension enablement. |
| test/smoke/src/main.ts | Wires the new chat model configuration smoke suite into the default smoke run. |
| test/smoke/src/areas/chat/chatModelConfig.test.ts | New smoke suite for Chat panel model configuration and request-body verification. |
| test/smoke/src/areas/agentsWindow/agentsWindow.test.ts | Adds a new Agents Window suite to verify Local session model configuration forwarding. |
| test/automation/src/chat.ts | Adds automation helpers for selecting models/config options and reading context usage details in panel chat. |
| test/automation/src/agentsWindow.ts | Adds parallel automation helpers scoped to the Agents Window active session input. |
| src/vs/workbench/contrib/chat/test/browser/widget/input/chatModelPicker.test.ts | Updates unit tests to match the model picker API/signature changes. |
| src/vs/workbench/contrib/chat/browser/widget/input/chatModelPicker.ts | Adjusts model picker item construction, labels, and hover cost/config presentation logic. |
| scripts/chat-simulation/common/mock-llm-server.ts | Adds typed model fixtures, a configurable request-capture mechanism, and a dedicated mock config model. |
Review details
- Files reviewed: 9/9 changed files
- Comments generated: 5
- Review effort level: Low
Comment on lines
784
to
+788
| if (path === '/chat/completions' && req.method === 'POST') { | ||
| readBody().then((body: string) => handleChatCompletions(body, res)); | ||
| readBody().then((body: string) => { | ||
| serverEvents.emit('capturedRequest', { path, method: 'POST', body }); | ||
| return handleChatCompletions(body, res); | ||
| }); |
Comment on lines
796
to
+800
| if (path === '/responses' && req.method === 'POST') { | ||
| readBody().then((body: string) => handleResponsesApi(body, res)); | ||
| readBody().then((body: string) => { | ||
| serverEvents.emit('capturedRequest', { path, method: 'POST', body }); | ||
| return handleResponsesApi(body, res); | ||
| }); |
Comment on lines
+228
to
+231
| itRepeat(10, 'forwards the selected reasoning effort and context size to the server', async function () { | ||
| const app = this.app as Application; | ||
| const chat = app.workbench.chat; | ||
|
|
Comment on lines
+569
to
+571
| itRepeat(10, 'forwards the selected reasoning effort and context size from the Local session', async function () { | ||
| const app = this.app as Application; | ||
|
|
| if (!opts.web) { setupChatTests(logger); } | ||
| if (!opts.web && !opts.remote && quality !== Quality.Dev && quality !== Quality.OSS) { setupCopilotCliTests(logger); } | ||
| if (!opts.web && !opts.remote) { setupChatSessionsTests(logger); } | ||
| if (!opts.web && !opts.remote) { setupChatModelConfigTests(logger); } |
The context-usage details popup is a sticky, locked, focus-trapping
body-level hover that overlaps the model-config button. A single Escape
could return while the popup was mid-teardown but still attached, so the
next forced click on the config button landed on the popup's action
buttons instead of opening the dropdown — wedging it open without rows
('Timed out opening the model configuration dropdown').
Add dismissContextUsageDetails() that presses Escape until the popup is
fully detached (count === 0), and call it at the start of openModelConfig
and in readContextUsageTokenLabel (both success and retry paths).
CI traces proved the previous approach futile: clicking the context-usage
gauge opens the details popup as a sticky hover (persistence.sticky =>
HoverWidget.isLocked=true), focus-trapping and not closable by Escape (30+
Escape presses in the trace left '.chat-context-usage-details' attached the
whole time). The locked overlay then sits over the model-config button, so the
next case's openModelConfig force-click lands on the popup and the dropdown
never opens ('Timed out opening the model configuration dropdown', always on
the case right after readContextUsageTokenLabel).
Read the label via the gauge's delayed (mouse) hover instead: same details
node but non-sticky/non-locked, so moving the pointer off the gauge auto-hides
it. dismissContextUsageDetails now parks the pointer at (0,0) and waits for the
overlay to detach (no Escape). Applied to chat.ts and agentsWindow.ts.
The non-sticky hover read was right (no more openModelConfig wedge), but widget.hover() timed out 30s on ~half the iterations: the gauge expands on hover and its progress arc animates, so Playwright's actionability (stability) check never settles, and the single un-timed hover() ate the whole deadline, defeating the retry loop. Trigger the delayed hover with a raw pointer move to the gauge's center (boundingBox + page.mouse.move), moving the pointer away first to guarantee a fresh mouse-enter on every attempt. No actionability checks, deterministic re-trigger. Applied to chat.ts and agentsWindow.ts.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.