test(e2e): full LLM round-trip with FakeAssistantProvider #463

Merged
yuga-hashimoto merged 1 commit into `main` from `feat/e2e-assistant-swap` on Apr 19, 2026

@yuga-hashimoto (Owner)

Priority 5: Refactor / Quality (testing the priority-2 contract)

Builds on #460. Drives the real `VoicePipeline` through a non-fast-path utterance so it falls into the LLM agent loop, with a scripted `FakeAssistantProvider` standing in for the embedded LLM.

No DI module changes needed: `ConversationRouter` already accepts dynamically registered providers via `registerProvider`, and Manual policy lets a test pin its own fake as the resolved provider.
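The pinning behaviour described above can be illustrated with a minimal sketch. The real `ConversationRouter` API is not shown in this PR, so every type here (`SketchRouter`, `SketchPolicy`, `SketchProvider`, `SimpleProvider`) and the `resolve()` method are hypothetical stand-ins; only the `registerProvider` name and the Auto/Manual policy distinction come from the description.

```kotlin
// Hypothetical stand-ins: the real ConversationRouter types are not shown in this PR.
interface SketchProvider {
    val id: String
    val isLocal: Boolean
}

data class SimpleProvider(
    override val id: String,
    override val isLocal: Boolean,
) : SketchProvider

sealed interface SketchPolicy {
    // Auto: pick the first provider that is usable given the network state.
    object Auto : SketchPolicy

    // Manual: always resolve the provider registered under this id.
    data class Manual(val providerId: String) : SketchPolicy
}

class SketchRouter(private val networkAvailable: () -> Boolean) {
    // LinkedHashMap keeps registration order, which Auto relies on here.
    private val providers = LinkedHashMap<String, SketchProvider>()

    var policy: SketchPolicy = SketchPolicy.Auto

    // Providers can be added at any time, so no DI module has to change.
    fun registerProvider(provider: SketchProvider) {
        providers[provider.id] = provider
    }

    fun resolve(): SketchProvider? = when (val p = policy) {
        is SketchPolicy.Manual -> providers[p.providerId]
        SketchPolicy.Auto -> providers.values.firstOrNull { it.isLocal || networkAvailable() }
    }
}
```

Under these assumptions, a test that sets `policy = SketchPolicy.Manual("fake")` keeps resolving its fake even if another provider is registered earlier or later, which is the property the second test in this PR guards.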

What changed

Tests (`app/src/androidTest/...`)

  • `e2e/fakes/FakeAssistantProvider`: an `AssistantProvider` implementation that returns scripted responses from a queue and records every `send()` call in `sentMessages` for assertions. It sets `capabilities.isLocal=true`, so the Auto policy never gates this provider on the emulator's network state.
  • `e2e/AssistantProviderE2ETest`:
    • `llm_response_is_spoken_via_tts`: registers the fake, queues an Assistant message, and drives `processUserInput("explain quantum computing")` (an ambiguous-info utterance that the fast path ignores). It asserts that (a) the user message was sent to the provider and (b) the canned reply was spoken via `FakeTextToSpeech`.
    • `fake_provider_is_resolved_under_manual_policy`: guards against silent re-routing if the production model-download path registers a real provider mid-suite.
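As a rough illustration of the scripted-queue pattern described in the list above, the sketch below shows a fake that replays queued replies and records what it was sent. It is not the code from this PR: the `AssistantProviderSketch` interface, the `Message` and `Capabilities` types, `enqueue`, and `processUserInputSketch` are assumptions made for the example, and `send()` is synchronous here although the real one may suspend or stream.

```kotlin
// Hypothetical types: the real AssistantProvider contract is not shown in this PR.
data class Message(val role: String, val text: String)

data class Capabilities(val isLocal: Boolean)

interface AssistantProviderSketch {
    val capabilities: Capabilities
    fun send(messages: List<Message>): Message
}

class FakeAssistantProviderSketch : AssistantProviderSketch {
    // Local, so a network-aware policy would never gate it.
    override val capabilities = Capabilities(isLocal = true)

    // Scripted replies, returned in FIFO order.
    private val scripted = ArrayDeque<Message>()

    // Every send() call is recorded for later assertions.
    val sentMessages = mutableListOf<List<Message>>()

    fun enqueue(reply: Message) {
        scripted.addLast(reply)
    }

    override fun send(messages: List<Message>): Message {
        sentMessages.add(messages.toList())
        // Fail loudly if the test forgot to script a reply.
        return scripted.removeFirstOrNull() ?: error("No scripted response queued")
    }
}

// Stand-in for FakeTextToSpeech: records what would have been spoken.
class FakeTextToSpeechSketch {
    val spoken = mutableListOf<String>()

    fun speak(text: String) {
        spoken.add(text)
    }
}

// Stand-in for the pipeline's LLM path: send the utterance, speak the reply.
fun processUserInputSketch(
    input: String,
    provider: AssistantProviderSketch,
    tts: FakeTextToSpeechSketch,
) {
    val reply = provider.send(listOf(Message(role = "user", text = input)))
    tts.speak(reply.text)
}
```

A test in this style queues one reply, drives the input, then checks both `sentMessages` and `spoken`. Throwing on an empty queue is a design choice in this sketch: an unscripted call fails the test immediately instead of returning a silent default.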

Why this matters

This is the first L3 test that exercises the priority-2 contract (local-LLM agentic path) end-to-end through the production graph. The L1 unit tests can't catch a router-vs-pipeline integration regression here because they don't bind the actual `VoicePipeline` singleton.

Test plan

  • `./gradlew assembleStandardDebugAndroidTest` — green
  • `./gradlew assembleStandardDebug testStandardDebugUnitTest` — green
  • Once #461 ("ci: add instrumented test workflow on macOS AVD", the CI emulator workflow) merges, `connectedStandardDebugAndroidTest` will run this in CI

@yuga-hashimoto yuga-hashimoto merged commit 4c50ea6 into main Apr 19, 2026
2 checks passed