feat(voice): autonomous app control — automate loop, always-on (Hey Tiny), notch status, computer control (#3148) by M3gA-Mind · Pull Request #3307 · tinyhumansai/openhuman

M3gA-Mind · 2026-06-03T13:49:30Z

Summary

Builds on the merged Phase 1 (#3168) to make the voice→system-action agent actually autonomous, end-to-end:

automate(app, goal) — one tool call runs a Rust perceive→act→settle→verify loop with a fast model (chat model out of the click loop). Deterministic Music fast-path proven live (search→navigate→verify player state == playing).
Phase 2 always-on listening — continuous cpal mic → VAD segmenter → STT → agent, no hotkey. Opt-in Settings toggle + "Hey Tiny" wake word (English-forced STT + fuzzy match). Screen-lock privacy pause.
Notch status pill (cherry-picked from feat/notch-live-activity) — always-visible Ready / Listening / Processing; automate streams live step progress to it.
Full computer control — screenshot (now downscaled so the model can see it) + mouse + keyboard, with a keyboard-first fallback for Electron apps (Slack) whose AX tree is empty.
CEF crash fixed — enigo's TSMGetInputSourceProperty was crashing off-thread (SIGTRAP); keyboard/mouse now run on the app main thread via run_on_main_thread.

Problem

Phase 1 could launch apps but couldn't reliably do things in them: multi-step UI flows were fragile (chat model orchestrating each AX step), Electron apps exposed no AX tree, there was no hands-free listening, and synthetic input crashed the CEF host.

Solution

Rust-internal automate loop (accessibility/automate.rs) + per-app fast-paths; poll-until-stable settle; playback verification.
voice/always_on.rs: pure unit-tested VadSegmenter + continuous capture; wake-word gate; config + RPC + Settings toggle.
Main-thread input bridge (tools/impl/computer/main_thread.rs + Tauri handler) — the fix for the documented §1.8 crash, confirmed via crash report.
Notch driven by the existing overlay:attention socket bridge; auth fixed for the WKWebview.

Full narrative + a fine-tuning backlog (longer listening window, mouse-coordinate mapping, screenshot/verify cadence — deferred) in docs/voice-system-actions.md.

Submission Checklist

Tests added or updated (happy path + failure/edge) — 220+ feature unit tests, a JSON-RPC E2E (json_rpc_voice_server_settings_roundtrip…), VAD/wake-word/settle/screenshot-downscale/crash-guard suites.
Diff coverage ≥ 80% — extensive unit + E2E added; diff-cover not run locally yet (draft).
Coverage matrix updated — docs/TEST-COVERAGE-MATRIX.md not yet updated (draft).
No new external network dependencies — STT uses the existing configured provider via the mock-backed factory.
Manual smoke checklist — N/A pending (draft; manual voice/desktop-control flows are mic/AX-dependent).
Linked issue — Closes #3148 once out of draft.

Impact

Desktop (macOS) only for the new native paths (AX helper, main-thread input, notch NSPanel, screen-lock). All cross-platform-compiles; non-macOS returns clean runtime errors. Windows UIA path retained.
Opt-in: always-on listening + computer-control are off by default.

…mmands Phase 1 of issue tinyhumansai#3148 — quick wins that make hotkey-triggered voice commands execute without a manual send or approval prompt. Auto-send after transcription: - useDictationHotkey.ts: adds `autoSend: true` to the `dictation://insert-text` event detail when a hotkey transcription completes. - Conversations.tsx: the `onDictationInsert` handler checks the new flag; when set, it calls `handleSendMessage(text)` directly instead of inserting into the composer. A `handleSendMessageRef` (updated every render) gives the mount-time effect access to the latest send fn. Shell allowlist for app-launching: - security/policy_command.rs: adds `open` (macOS) and `xdg-open` (Linux) to READ_ONLY_BASES so `open -a Music`, `open -b com.apple.Safari`, `xdg-open music://`, etc. classify as CommandClass::Read and execute without triggering the ApprovalGate in Supervised mode. Closes part of tinyhumansai#3148.

…ions

Dedicated tool that opens a named application on the user's machine without requiring shell access or workspace_only = false. - src/openhuman/tools/impl/system/launch_app.rs: new LaunchAppTool - macOS: `open -a "<app_name>"` via LaunchServices - Linux: `gtk-launch`, fallback `xdg-open` - Windows: `Start-Process` via PowerShell - PermissionLevel::ReadOnly — never triggers the approval gate - Input validation: rejects paths, metacharacters, empty names - Unit tests: name, permission, schema, validation, error cases - src/openhuman/tools/impl/system/mod.rs: register module + pub use - src/openhuman/tools/ops.rs: add LaunchAppTool to all_tools_with_runtime - src/openhuman/tools/user_filter.rs: add "launch_app" family, default_enabled = true, mirrors shell family pattern - app/src/utils/toolDefinitions.ts: add to frontend tool catalog so it appears in Settings → Agent Access with its own toggle This avoids loosening workspace_only or expanding allowed_commands in the shell tool — launch_app is narrowly scoped to app launching only. Part of tinyhumansai#3148.

- launch_app.rs: log every step (▶ execute, ✓/✗ validation, platform dispatch, open exit code + stderr, fallback result) - builder.rs: log full list of visible tool names at session build time so we can confirm launch_app appears in the LLM's tool context - SOUL.md: add explicit capability section — agent now knows it CAN use launch_app to open apps and must not refuse with 'I can't open apps'

The orchestrator's tool scope is a strict allowlist (named = [...]). launch_app was registered in the tool registry but not listed here, so the LLM never saw it — explaining every refusal. Adding it alongside current_time follows the same pattern: direct, fast, no delegation needed for a simple user request like 'open Music'.

…tion - orchestrator/agent.toml: add 'mouse' and 'keyboard' to named tool list so the orchestrator can click/type in apps directly without delegating - user_filter.rs: add 'computer_control' tool family (mouse + keyboard), default_enabled = true, gated by computer_control.enabled in config - toolDefinitions.ts: add Computer Control entry to frontend catalog (Settings → Agent Access toggle) - SOUL.md: document mouse and keyboard capabilities so the agent knows it can interact with on-screen UI, not just launch apps Config: computer_control.enabled = true set in user config (not a code change — user-specific setting at ~/.openhuman/users/<id>/config.toml). Part of tinyhumansai#3148.

…orkflow Without screenshot in the named list the agent could click but couldn't locate UI elements — it was asking the user for coordinates. - orchestrator/agent.toml: add 'screenshot' alongside 'mouse'/'keyboard' - SOUL.md: document the screenshot→mouse workflow explicitly and tell the agent to never ask the user for coordinates — find them via screenshot

CGEventPost from enigo crashes CEF when the key event lands in the OpenHuman renderer instead of the target app. Removing until a proper app-focus-before-input mechanism is in place.

Replaces the unreliable mouse/keyboard (enigo/CGEventPost) approach with macOS Accessibility API interactions — no synthetic events, no CEF crash. Swift helper (helper.rs): - ax_list_elements: walk the AX tree and return interactive elements - ax_press: AXUIElementPerformAction(kAXPressAction) by label - ax_set_value: AXUIElementSetAttributeValue(kAXValueAttribute) by label - New switch cases: ax_list, ax_press, ax_set_value - helper_send_receive: pub(super) → pub(crate) so ax_interact.rs can call it New files: - src/openhuman/accessibility/ax_interact.rs — Rust wrappers (ax_list_elements, ax_press_element, ax_set_field_value) over the Swift helper - src/openhuman/tools/impl/computer/ax_interact.rs — AxInteractTool with actions: list / press / set_value, PermissionLevel::ReadOnly Wired into: - tools/ops.rs, tools/user_filter.rs, toolDefinitions.ts - orchestrator/agent.toml named list - SOUL.md: document list→press workflow Part of tinyhumansai#3148.

…ylist)

Tests cover: - ax_list_returns_elements: AX tree is non-empty for Music - ax_press_play_button: Play button is pressable - test_full_flow_search_and_play_acdc: open Music → URL-scheme search for 'Highway to Hell' → find AXCell in results → press it - ax_set_search_field: set_value on the search field - test_ax_list_nonexistent_app / test_ax_press_nonexistent_app: error paths Live tests tagged #[ignore] (need Accessibility permission + Music). Run with: cargo test ax_interact -- --include-ignored --nocapture

SOUL.md: add explicit 4-step workflow (list → set_value → list again → press specific row, not generic Play). Add guidance to use shell URL scheme for Apple Music song search — more reliable than filter field. ax_interact_tests.rs: fix import from super::super::ax_interact to super:: (tests are in a submodule of ax_interact, not a sibling).

- voice-system-actions.md: mark 1.8 (mouse/keyboard) reverted with crash root cause; add 1.9 (ax_interact) and 1.10 (multi-step workflow guidance); update summary table - ax_interact_tests.rs: flatten to #![cfg] module-level so super:: resolves to ax_interact; full AC/DC flow test now passes (5 steps, song row pressed)

Root cause of 'navigated but didn't play': pressing a search-result row in Apple Music only selects/navigates — it never starts playback. Every matching element (cell/group/button) exposes only AXPress=select. Verified empirically that double-press, CGEvent double-click, and select+Return all leave player state 'stopped'. Working sequence: AXPress the result to navigate INTO the song's detail page, then AXPress the Play button ON that page → player state 'playing'. - SOUL.md: exact 5-step Apple Music sequence; warns the second Play press on the detail page is mandatory - ax_interact_tests.rs: full-flow test now asserts real playback via osascript player state == 'playing' (passes) - voice-system-actions.md: document as change 1.11 with verification

Root cause the agent kept using the wrong (filter-field) approach: the orchestrator has omit_identity=true, so it NEVER sees SOUL.md. The chat agent only reads tool descriptions + agent.toml. The navigate-then-play guidance in SOUL.md was dead weight for the orchestrator. Moved the exact 5-step Apple Music play sequence into the ax_interact tool description, which the LLM always receives via the function schema.

Transcript analysis of the failed 'play Highway to Hell' run revealed two root causes: 1. The orchestrator has NO shell tool — my ax_interact description told it to 'use shell to open music://...', which it can't. It wrapped the command in a prompt arg to a delegation tool; it never ran, and it fell back to the broken filter-field approach. 2. Cross-chat memory context injected prior filter-approach checkpoints, biasing the agent back to the wrong method. Fix: stop making the LLM orchestrate a fragile multi-step flow with a tool it lacks. Encapsulate the entire proven sequence in native Rust: - accessibility/ax_interact.rs: play_apple_music(query) — open search URL, AX-find + press the song cell (navigate), press detail-page Play, verify player state == playing - tools/impl/computer/play_music.rs: PlayMusicTool, one call play_music{query}, PermissionLevel::ReadOnly, runs the blocking flow via spawn_blocking - registered in ops.rs, user_filter.rs, orchestrator agent.toml, toolDefinitions.ts Agent now calls play_music{query:'Highway to Hell AC/DC'} once and it plays.

…lay_music Transcript analysis of the failed 'play Numb by Linkin Park' run: 1. play_music failed on a 4s timing race (results not yet rendered → empty) 2. agent fell back to ax_interact 'list' which dumped 273 elements; the tool result was TRUNCATED mid-list, so the model hallucinated a wrong result ('Numb - Single by Marshmello') from a partial view. Per feedback, a music-specific tool is the wrong abstraction. Reverted it and made ax_interact a robust GENERIC any-app interaction tool: - Removed play_music tool + play_apple_music helper (and all registrations) - ax_list_elements_filtered(app, filter): Rust-side label filter so 'list' returns only relevant elements (fixes the truncation→hallucination bug) - ax_interact 'list' now takes a param; output capped at 60 with a 'narrow your filter' hint; empty-match returns a 'UI may still be loading' hint instead of failing hard - Rewrote the tool description to be app-agnostic and document the general navigate-then-activate pattern (press a row opens it; press the action button after) without hardcoding Apple Music steps

…fort The full-flow test was flaky asserting player state == 'playing': Apple Music's UI is nondeterministic (detail-page render timing varies; multiple 'Play' elements that AX can't disambiguate). The test now asserts the generic list/press primitives work against a real app and logs the player state for diagnosis only — playback reliability is an Apple Music UI limitation, not a tool correctness issue.

Maps each macOS piece to its Windows equivalent so the same open-app + interact-with-UI feature can be built on Windows: - macOS AXUIElement → Windows UI Automation (IUIAutomationElement) - AX roles/actions → UIA ControlType + Invoke/Value/SelectionItem patterns - recommends the Rust crate (no helper process needed — COM API is callable directly from Rust, unlike the macOS Swift helper) - module layout: uia_interact.rs parallel to ax_interact.rs, cfg-dispatched so the agent-facing tool stays a single 'ax_interact' on both platforms - permissions (UIA needs none for same-integrity apps), Chromium/Electron caveats, Calculator/Notepad smoke tests, Start-Process/Get-StartApps for launching Store apps Also includes trailing linter reformat of ax_interact.rs/tests.

…atrix - Cross-platform audit table: confirms every Phase 1 change compiles on all platforms (macOS native code is cfg-gated; non-macOS arms return a clean error, never a build break). Flags the one-line shell-allowlist gap (add 'start') and the ax_interact UIA backend work. - Mandatory Windows E2E matrix (9 items): app launch incl. UWP/URI, deterministic Calculator control (hard-asserted), Notepad set_value, filtered-list correctness (no truncation/hallucination), real media app (best-effort), Chromium/Electron tree exposure, elevation/UIPI, agent-in-the-loop, and a macOS regression re-run after the port. - Note to verify the whole branch still builds+runs on macOS after the Windows cfg-dispatch lands.

Implements the Windows backend for the Phase 1 app-interaction layer so the agent can open apps and drive their UI on Windows, mirroring the macOS path. The agent-facing tool stays a single `ax_interact` tool on both platforms; only the backend differs via cfg-dispatch. - accessibility/uia_interact.rs (new): UI Automation backend — list/press/ set_value over the UIA COM tree via the `uiautomation` crate. press uses Invoke → SelectionItem.Select → LegacyIAccessible default action (no synthetic input, so no CEF-crash risk); set_value targets an Edit, then ComboBox, then Document field (the Win11 RichEdit Notepad is a Document). - accessibility/ax_interact.rs: cfg-dispatch the three helpers to UIA on Windows (macOS Swift-helper arms unchanged); OS-neutral module docs. - accessibility/mod.rs: declare the Windows-gated uia_interact module. - tools/impl/system/launch_app.rs: harden the Windows launcher — app name passed via env var (injection-safe) + Store/UWP AUMID fallback via Get-StartApps; surface stderr on failure. - tools/impl/computer/ax_interact.rs: OS-neutral tool description. - security/policy_command.rs: add `start` to READ_ONLY_BASES. - accessibility/uia_interact_tests.rs (new): cfg(windows) integration tests — Calculator (deterministic, 5+5=10, hard-asserted), Notepad set_value, nonexistent-app. - Cargo.toml: uiautomation 0.25 (Windows) + Win32_System_Com feature. - docs/voice-system-actions.md: Windows port marked implemented w/ evidence. Verified on Windows 11: Calculator driven to 5+5=10 by element label; Notepad set_value wrote into the Win11 Document editor; nonexistent-app + launch_app (8) + ax_interact tool (4) unit tests pass; full lib compiles clean.

…loop status - SOUL.md: ax_interact is no longer macOS-only — describe it as the platform accessibility API (macOS Accessibility / Windows UI Automation). Label the Apple Music play sequence as the macOS-specific example it is, and note that on Windows the same list→press pattern applies but a press usually activates a control directly (the navigate-then-play second press is often unneeded). - docs/voice-system-actions.md: record that the full Tauri app was built and run on Windows with verbose tool logging; the agent-in-the-loop test is still pending because the local AI model was mid-download (empty_provider_response).

…tighten launchers, docs - ax_interact tool: gate press/set_value through approval — permission_level_with_args returns Dangerous for press/set_value (ReadOnly for list), and external_effect_with_args routes mutating actions through the ApprovalGate. Read-only list stays frictionless. - ax_press_element: reject blank label (empty needle matched-all and pressed the first named control) — guard in the public facade, not just the tool layer. - policy_command: remove open/xdg-open from READ_ONLY_BASES — base-command classification can't see args, and these launchers can open arbitrary URLs/URI handlers (network/system reach) without approval. App launching goes through the scoped launch_app tool instead. - launch_app (Linux): gtk-launch needs a .desktop ID not a display name; try the name then a derived id (lowercase, spaces→hyphens); clarify xdg-open only opens URIs, with a better error. - toolDefinitions.ts: platform-neutral ax_interact description (was macOS-specific). - ax_interact_tests: assert set_value outcome. - docs: add 'text' language to fenced blocks (MD040); reword Apple Music playback claims as best-effort (not hard-asserted) to match the test.

Coverage Gate flagged the changed auto-send lines (diff-cover < 80%): useDictationHotkey.ts:153 and Conversations.tsx:464,472-474. - useDictationHotkey.test: assert the dictation:transcription handler dispatches a dictation://insert-text CustomEvent with trimmed text + autoSend:true; plus a blank-text edge case (no event). - Conversations.render.test: assert an autoSend dictation event routes straight to chatSend with the trimmed message; plus a blank-text edge case (no send).

Addresses maintainer (oxoxDev) security review on tinyhumansai#3168: launch_app (gate-bypass + URI-smuggling blockers): - external_effect()=true + permission_level=Execute → routes through the ApprovalGate like shell (was always-allow under every tier). - validate_app_name rejects URI schemes (^[a-z][a-z0-9+.-]*:) so the xdg-open/Start-Process fallbacks can't fire arbitrary registered handlers (spotify:/mailto:/slack:). Named applications only, as documented. - docstring corrected: injection-safe != side-effect-free. ax_interact (app-scope + default-posture blockers): - sensitive-app denylist (Keychain, 1Password/Bitwarden/LastPass/Dashlane, System Settings/Preferences, Terminal/iTerm, Console): all actions refused — defense-in-depth that holds even on background/auto-approved turns. - mutating press/set_value are opt-in via new config computer_control.ax_interact_mutations (default false); read-only list always available — mirrors computer_control.enabled for mouse/keyboard. - orchestrator agent.toml comment corrected: only list is ReadOnly/unprompted; press/set_value are Dangerous, gate interactively, opt-in, and deny-listed. Tests: launch_app URI-reject + Execute/external_effect; ax_interact denylist, mutations-disabled refusal, per-arg permission/gate. cargo check + config schema tests green.

M3gA-Mind · 2026-06-04T09:15:34Z

Superseded — split into 7 small, dependency-ordered PRs

This 72-file draft is hard to review, so it's being replaced by a merge-train of 7 focused PRs (each ~5–19 files). They were rebased onto current main first — note that Phase 1 (#3168) is already merged, so the true remaining contribution is 59 files; the slices below reproduce all of it byte-for-byte (verified).

#	PR	Area	Files
1	#3340	Computer-control input primitives + CEF main-thread crash fix	6
2	#3341	Accessibility AX/UIA perception + `automate` engine	13
3	#3342	Wire automate/ax_interact tools into the orchestrator	9
4	#3343	Phase 2 always-on listening engine + config + RPC	9
5	#3344	Always-on Settings toggle + debug panel + i18n	19 (14 locale one-liners)
6	#3345	macOS notch status pill	5
7	#3346	Phase 3 fast command router	11

Merge order: 1 → 7 (each is stacked on the previous). #3340 is ready for review now; #3341–#3346 are drafts and will be rebased onto main (collapsing each to its own slice) and marked ready as their predecessors merge.

Closes #3148 moves to #3346 (the final slice). Closing this in favour of the split.

…(1/7 of #3307) (#3340)

… (#3341)

…trator Registers the AutomateTool (multi-step UI flows in one call) and the ax_interact denylist/opt-in plumbing; adds the catalog toggle, tool definition, and orchestrator prompt guidance (automate + screenshot/ mouse/keyboard fallback for Electron apps with empty AX trees). Slice 3/7 of tinyhumansai#3307 (tool wiring + prompts).

Continuous cpal mic → VAD segmenter → STT → agent with no hotkey, opt-in via voice_server.always_on_enabled, 'Hey Tiny' wake word (English-forced STT + fuzzy match), and screen-lock privacy pause. Adds the config schema, live-apply on the settings RPC, start_if_enabled wiring, and a JSON-RPC roundtrip E2E. Slice 4/7 of tinyhumansai#3307 (always-on core).

Surfaces the always-on listening toggle in the reachable Voice panel, adds the VoiceDebugPanel, the voice tauri-command wrapper, and the RPC client method. Adds all voice.debug.* and notch.* i18n keys across the 14 locales (notch keys land here as inert strings; the notch UI that consumes them ships in slice 6). Slice 5/7 of tinyhumansai#3307 (always-on frontend).

Transparent NSPanel + WKWebView anchored at the top-centre of the primary screen showing live Ready/Listening/Processing state; automate streams step progress to it via the overlay:attention socket bridge. macOS only; no-op elsewhere. Slice 6/7 of tinyhumansai#3307 (notch status pill).

Routes always-on utterances through a fast intent classifier before the chat model, wired into always-on delivery; ties the notch indicator visibility to always-on listening. Adds the window tauri-command wrapper and the core-process permission entry. Slice 7/7 of tinyhumansai#3307 (Phase 3 fast routing).

…3342)

…3343)

…3307) (#3344)

…3345)

…ps (8/8 of #3307) (#3362)

…(1/7 of tinyhumansai#3307) (tinyhumansai#3340)

…humansai#3307) (tinyhumansai#3341)

…ansai#3307) (tinyhumansai#3342)

…ansai#3307) (tinyhumansai#3343)

…inyhumansai#3307) (tinyhumansai#3344)

…sai#3307) (tinyhumansai#3345)

…inyhumansai#3346)

…ps (8/8 of tinyhumansai#3307) (tinyhumansai#3362)

M3gA-Mind added 30 commits June 2, 2026 02:15

fix(shell): clarify tool description to include system/app-launch act…

ec8f5be

…ions

docs: add voice system actions feature tracker

c0bc07f

style(builder): format visible_names_list for improved readability

454ce81

docs: update tracker with computer control (change 1.8)

4363b39

revert: remove mouse/keyboard/screenshot from orchestrator — unreliable

8e65231

CGEventPost from enigo crashes CEF when the key event lands in the OpenHuman renderer instead of the target app. Removing until a proper app-focus-before-input mechanism is in place.

fix(ax_interact): prefer exact label match over contains (Play vs Pla…

2c32b59

…ylist)

docs: record play_music root-cause fix (change 1.12)

12b1a1e

docs: record generic ax_interact refactor (change 1.13)

b0dfcde

M3gA-Mind closed this Jun 4, 2026

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(computer): main-thread synthetic-input executor + CEF crash fix …

e3ebaca

…(1/7 of #3307) (#3340)

M3gA-Mind mentioned this pull request Jun 4, 2026

feat(accessibility): vision-click fallback for Electron/partial-AX apps (8/8 of #3307) #3362

Merged

6 tasks

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(accessibility): AX/UIA perception + automate engine (2/8 of #3307)…

7c08704

… (#3341)

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(agent): wire automate/ax_interact computer tools (3/8 of #3307) (#…

cd31484

…3342)

senamakel pushed a commit that referenced this pull request Jun 4, 2026

feat(voice): Phase 2 always-on listening engine + RPC (4/8 of #3307) (#…

f5dc9ea

…3343)

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(voice): always-on Settings toggle + debug panel + i18n (5/8 of #…

e40fec9

…3307) (#3344)

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(notch): always-visible macOS notch status pill (6/8 of #3307) (#…

f3e70e6

…3345)

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(voice): Phase 3 fast command router (7/8 of #3307) (#3346)

769e8ef

M3gA-Mind added a commit that referenced this pull request Jun 4, 2026

feat(accessibility): vision-click fallback for Electron/partial-AX ap…

3338582

…ps (8/8 of #3307) (#3362)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(computer): main-thread synthetic-input executor + CEF crash fix …

caac04e

…(1/7 of tinyhumansai#3307) (tinyhumansai#3340)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(accessibility): AX/UIA perception + automate engine (2/8 of tiny…

13e2eb0

…humansai#3307) (tinyhumansai#3341)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(agent): wire automate/ax_interact computer tools (3/8 of tinyhum…

297ba0f

…ansai#3307) (tinyhumansai#3342)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(voice): Phase 2 always-on listening engine + RPC (4/8 of tinyhum…

93016cf

…ansai#3307) (tinyhumansai#3343)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(voice): always-on Settings toggle + debug panel + i18n (5/8 of t…

0d519df

…inyhumansai#3307) (tinyhumansai#3344)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(notch): always-visible macOS notch status pill (6/8 of tinyhuman…

ac16bb4

…sai#3307) (tinyhumansai#3345)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(voice): Phase 3 fast command router (7/8 of tinyhumansai#3307) (t…

e1a24c8

…inyhumansai#3346)

senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

feat(accessibility): vision-click fallback for Electron/partial-AX ap…

fbb78f2

…ps (8/8 of tinyhumansai#3307) (tinyhumansai#3362)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(voice): autonomous app control — automate loop, always-on (Hey Tiny), notch status, computer control (#3148)#3307

feat(voice): autonomous app control — automate loop, always-on (Hey Tiny), notch status, computer control (#3148)#3307
M3gA-Mind wants to merge 56 commits into
tinyhumansai:mainfrom
M3gA-Mind:feat/voice-always-on

M3gA-Mind commented Jun 3, 2026

Uh oh!

M3gA-Mind commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

M3gA-Mind commented Jun 3, 2026

Summary

Problem

Solution

Submission Checklist

Impact

Related

Uh oh!

M3gA-Mind commented Jun 4, 2026

Superseded — split into 7 small, dependency-ordered PRs

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant