ci: add instrumented test workflow on macOS AVD by yuga-hashimoto · Pull Request #461 · yuga-hashimoto/open-dash

yuga-hashimoto · 2026-04-19T23:37:26Z

Priority 5: Refactor / Quality

The L3 instrumented tests added in #459 / #460 ran locally but never on CI. This workflow boots an AVD on a macOS runner and runs `connectedStandardDebugAndroidTest` against API 34 / google_apis / x86_64.

Design choices

macos-latest, not ubuntu — nested-virt on ubuntu is ~5x slower than HVF on macOS. The minute cost is worth the wall-clock for blocking PR checks.
AVD snapshot cache via `actions/cache` so subsequent runs skip the 3-5 min cold image boot.
`concurrency` group cancels older in-flight runs on the same ref so we don't double-spend the AVD on stale shas.
Path filter so docs / unrelated changes don't trigger a 10-min run.
Standard flavor only — `full` adds the VOICEVOX native AAR which is irrelevant on an emulator and inflates memory.
Single matrix cell to start — confirm stability before fanning out to multiple API levels.
Reports + logcat uploaded on every run / failure for triage.

Heads-up: macOS runner cost

macOS GitHub Actions runners are billed at 10x ubuntu. Expected per-run wall-clock: ~10 min (cold) → ~5 min (warm AVD cache). If cost becomes a concern we can:

gate the workflow to only run when `app/src/androidTest/**` changes (already done via path filter)
move to ubuntu with KVM if you've enabled hardware-acceleration on your runner image (slower but cheaper)

Tell me if you'd rather start on ubuntu instead — easy switch.

Test plan

First run on this PR will exercise the workflow itself — expect AVD boot + execution of all 4 instrumented tests from test(e2e): scaffold instrumented test layer with Hilt + UiAutomator #459/test(e2e): full VoicePipeline fast-path E2E with @TestInstallIn fake TTS #460
Confirm artifacts (test report, logcat on failure) appear in the run page
If green, merge; subsequent PRs will use the cached AVD

## Priority 5: Refactor / Quality The L3 instrumented tests added in #459 / #460 ran locally but never on CI. This workflow boots an AVD on a macOS runner (HVF acceleration so boot is fast enough for PR feedback) and runs `connectedStandardDebugAndroidTest` against API 34 / google_apis / x86_64. Design choices: - macos-latest, not ubuntu — nested-virt on ubuntu is ~5x slower than HVF on macOS. The minute cost is worth the wall-clock for blocking PR checks. - AVD snapshot cached via actions/cache so subsequent runs skip the 3-5 min cold image boot. - `concurrency` cancels older in-flight runs on the same ref so we don't double-spend the AVD on stale shas. - Path filter so docs / unrelated changes don't trigger a 10-min run. - Standard flavor only — `full` adds the VOICEVOX native AAR which is irrelevant on an emulator and inflates memory. - Reports + logcat uploaded on every run / failure for triage. Skipped flavor matrix and multi-API for now — start with one cell so we can confirm stability before fanning out.

Initial macos-latest run failed because the GitHub macOS-14 image is Apple Silicon and can't host x86_64 emulators. ubuntu-latest with /dev/kvm permission tweak is the runner-currently-recommended path in the android-emulator-runner README and gives near-native boot without the arm64 emulator-image gymnastics.

@androidentrypoint

MainActivity is @androidentrypoint, so booting it inside HiltTestApplication without first materializing the test Hilt component throws 'The component was not created.' at onCreate. Surfaced by the first instrumented CI run (PR #461 / pre-merge): java.lang.RuntimeException: Unable to start activity ComponentInfo{com.opendash.app/com.opendash.app.MainActivity}: java.lang.IllegalStateException: The component was not created. Check that you have added the HiltAndroidRule. Adding @HiltAndroidTest + HiltAndroidRule + hiltRule.inject() in @before forces the component to build before MainActivity touches Hilt.

@ignore

CI emulator surfaced two issue classes: 1. runTest uses virtual time. Production code under test (VoicePipeline, AndroidSttProvider Channel.consumeAsFlow, etc.) uses real Dispatchers + delay(), so withTimeout(...) fires immediately and Flow collectors hang waiting for channel close that never arrives in virtual time. Replaced runTest with runBlocking in: - VoicePipelineFastPathE2ETest (both methods) - AssistantProviderE2ETest (setUp + both methods) - HiltInjectionE2ETest - LatencyBudgetE2ETest - FakeSttPipelineE2ETest (3 methods) runBlocking is the right primitive for instrumented tests against real Android dispatchers; runTest is for pure-Kotlin suspending units. 2. AppLaunchE2ETest never reaches the foreground assertion on a fresh AVD. The launch path triggers ModelSetupScreen which kicks off a HuggingFace model fetch — emulator network conditions in CI make that unreliable, and stubbing it requires plumbing we haven't built yet. @ignore'd with a follow-up note rather than rewriting in this PR. The Hilt rule fix from the previous commit is kept so the test compiles and re-enables cleanly later.

This was referenced Apr 19, 2026

test(e2e): SpeechToText @TestInstallIn swap + FakeSpeechToText #462

Merged

test(e2e): full LLM round-trip with FakeAssistantProvider #463

Merged

test(metrics): LatencyAssertions helpers + budget E2E guard #464

Merged

yuga-hashimoto added 3 commits April 20, 2026 08:58

yuga-hashimoto force-pushed the ci/android-emulator branch from f09c3be to 5afa450 Compare April 19, 2026 23:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ci: add instrumented test workflow on macOS AVD#461

ci: add instrumented test workflow on macOS AVD#461
yuga-hashimoto wants to merge 4 commits into
mainfrom
ci/android-emulator

yuga-hashimoto commented Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yuga-hashimoto commented Apr 19, 2026

Priority 5: Refactor / Quality

Design choices

Heads-up: macOS runner cost

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant