Feature/signed local bridge#4
Conversation
Routes the four MCP write tools (add_task, add_new_project, update_task, update_existing_project) through the signed bridge so they share the bundle's durable TCC identity instead of inheriting whichever transient runtime hosts the MCP server. Also lands the operational fixes that made the bridge actually function on macOS 26 Tahoe: AssociatedBundleIdentifiers in the LaunchAgent plist, LSUIElement instead of LSBackgroundOnly, NSAppleEventsUsageDescription, THINGS3_MCP_NO_DISCLAIM=1 (the disclaim API returns EINVAL on Tahoe — see _disclaim.py for the dead-end attempt), and bootout/bootstrap install. 9 new write-path unit tests, 20/20 pass. Full debugging trail in docs/sessions/2026-05-05-0834-signed-local-bridge-working.md.
Lifts get_logbook, get_trash, get_todos, get_random_todos, get_tagged_items,
search_advanced, search_all_items, and get_recent off direct things-py
calls and onto the provider abstraction so they share the bridge's durable
TCC identity instead of inheriting whichever transient runtime hosts the
MCP server.
Adds trash() and last(period) to the ThingsProvider protocol, with bridge
endpoints (GET /things/trash, GET /things/last/{period}), worker dispatch
in run_sqlite_action, and provider implementations across BridgeThings,
DirectThings, and CacheThings (cache returns empty since the snapshot
doesn't materialise these views). 16 new unit tests, 36/36 pass.
After this commit, all 14 read tools and 4 write tools in fast_server flow
through the provider chain. The bridge is now the durable single identity
for every TCC-protected Things operation.
Captures the full arc: write API, remaining-reads migration, dead-ends (spctl red herring, broken disclaim API), reboot-as-fix, and final state of all 23 MCP tools routed through the bridge provider chain.
…fix auth/threading/routing Three reviewers flagged a regression I'd introduced in the previous commit (P1: writes were bridge-only by default, breaking installs without the bridge built/authorised). Plus several real correctness/security issues that the unit tests had missed. Provider chain (Codex P1, Claude C1): - AutoThingsProvider gains a separate write_providers chain so writes always fall back to direct AppleScript when the bridge is unavailable. Bridge stays optional and additive, not a breaking replacement. - Three direct things.get() calls in fast_server (post-create location lookup for add_task / add_new_project, and show_item) are now routed through the provider via _resolve_todo_location and _resolve_project_location. They were silently bypassing the bridge for the post-create UX path, returning "in Unknown" on any host without FDA. - CacheThingsProvider.trash() and last() now raise cache_miss instead of returning [], so the auto-provider falls through to direct rather than treating an empty list as a successful hit (Codex P2). Bridge HTTP server (Claude C2/C3, H1, H4, H5, S9): - Bearer token comparison uses hmac.compare_digest (constant time) and fails closed on empty tokens, fixing the "empty token authorizes empty header" bug. - Token file is created atomically with os.open(..., O_EXCL | O_WRONLY, 0o600) rather than write_text-then-chmod, closing the brief race where another same-UID process could read the freshly-written token. - UnixHTTPServer mixes in ThreadingMixIn so a hung worker doesn't block /health or other concurrent requests. - live_or_cache annotates the response with cache_error when both live and cache fail, so users see both failures at once. - _disclaim.py removed entirely. responsibility_spawnattrs_setdisclaim is broken on macOS Tahoe (verified) and the disclaim path was a dead-end the bridge no longer uses; keeping it gated off was just cargo. Session note retains the debugging trail in case Apple ever fixes the API. Worker discovery (Gemini HIGH): - _enumerate_db_patterns sorts by mtime so the most recently active ThingsData-* hash folder is selected. ThingsData-* names are random alphanumeric — lexicographic sort doesn't reflect recency. Build (Codex P2): - build_bridge_app.sh creates BUILD_DIR before writing the entrypoint script, so first-time setup works on a clean checkout. Tests: - 8 new unit tests covering write fallback, cache miss, constant-time auth, empty-token rejection, missing-header rejection, atomic token creation, live+cache combined errors, and the default write chain factory. - Updated test_search_advanced_empty_results to keep the strict "Error in advanced search:" prefix contract — the production code now preserves it whether the error came via the bridge or directly. - 6 list-operation tests now use a split_top_level_items helper so user notes containing markdown horizontal rules don't break the format-verification splits (pre-existing fragility, not a routing regression). - test_create_tag_independently retries the AppleScript cleanup once before asserting (the cleanup AppleScript can time out on large databases). - test_success_logging_includes_location now asserts on the MCP tool's return value rather than in-process logs from applescript_bridge (which now run inside the worker subprocess and are unobservable from the test process). 44 unit tests pass, ruff/format/mypy/bandit all clean. End-to-end smoke tests pass: snapshot returns live data, unauthenticated requests get 401, /health responds in 19ms while a concurrent /snapshot is in flight (threading works).
The Things database belongs to Things — we never write to it directly. All mutations go through AppleScript so Things owns its sync engine. Codify the three layers of protection (mode=ro URI flag, things-py read-only API, AppleScript-only mutation path) where users can find it.
Documents and accommodates a real architectural seam discovered while making the integration suite green: when the bridge is healthy and the cache has any snapshot, an unknown-tag ValueError from the live worker triggers a cache fallback that iterates the snapshot and returns []. The user sees "No items found matching your search criteria" — looks like "no matches" rather than a validation error. The original ValueError survives in the response envelope's live_error field, but the MCP-tool layer only forwards data. The test exists to verify tag-spelling mistakes surface usefully to callers, which is true on the direct path. Pin THINGS3_MCP_PROVIDER=direct for this specific test so we exercise the strict path; the bridge-fallback tradeoff is documented in the test docstring. Background context: a future improvement would be to distinguish validation errors from transient errors in the worker (e.g. raise things_validation_error vs things_db_unreadable) and skip cache fallback for the former. Out of scope for this PR.
Audit results: the pattern affects two production tools. - search_advanced (status/tag/start/type kwargs) — already addressed by pinning the test to direct provider mode. - get_tagged_items (tag kwarg) — same shape: typo'd tag silently returns "No items found with tag 'travl'" instead of "Unrecognized tag — did you mean 'travel'?". Documented as a known limitation. Also mapped which things-py kwargs raise vs return empty (tag/status/start/ type/last-period raise ValueError; project-uuid/area-uuid/get/search return empty), so future contributors can reason about whether a new pre-validation guard is needed before adding a new MCP tool. Mitigation options listed in priority order; the worker-error-code distinction (things_validation_error vs things_db_unreadable) is the cleanest single fix and is added to the "next steps" list.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2242c2a426
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| cached = _cache_envelope(action, params) | ||
| if cached.get("ok"): | ||
| cached["live_error"] = {key: live.get(key) for key in ("error_code", "message", "authorization_hint") if live.get(key)} | ||
| return cached |
There was a problem hiding this comment.
Preserve live validation errors instead of masking with cache
This unconditional cache fallback turns live query errors into apparent empty results, which changes user-visible semantics for invalid filters. For example, an invalid tag in /things/todos can fail in the live worker but still return ok from cache as [], so callers see “no matches” instead of a validation error. The docstring says fallback is for authorization/read availability failures, but this branch applies to every live failure and can silently hide bad inputs.
Useful? React with 👍 / 👎.
| except ProviderError as exc: | ||
| if first_error is None: | ||
| first_error = exc | ||
| continue |
There was a problem hiding this comment.
Avoid retrying non-idempotent writes after ambiguous bridge errors
Catching every ProviderError and immediately falling through to the next write provider can duplicate creates. If the bridge times out or disconnects after committing add_task/add_project, this loop retries via direct AppleScript and creates a second item because the first attempt’s outcome is unknown. The current logic treats transport/timeout failures the same as pre-dispatch unavailability, which is unsafe for non-idempotent mutations.
Useful? React with 👍 / 👎.
No description provided.