Skip to content

Feature/signed local bridge#4

Open
rossshannon wants to merge 19 commits into
mainfrom
feature/signed-local-bridge
Open

Feature/signed local bridge#4
rossshannon wants to merge 19 commits into
mainfrom
feature/signed-local-bridge

Conversation

@rossshannon
Copy link
Copy Markdown
Owner

No description provided.

rossshannon added 15 commits May 1, 2026 19:46
Routes the four MCP write tools (add_task, add_new_project, update_task,
update_existing_project) through the signed bridge so they share the
bundle's durable TCC identity instead of inheriting whichever transient
runtime hosts the MCP server. Also lands the operational fixes that made
the bridge actually function on macOS 26 Tahoe: AssociatedBundleIdentifiers
in the LaunchAgent plist, LSUIElement instead of LSBackgroundOnly,
NSAppleEventsUsageDescription, THINGS3_MCP_NO_DISCLAIM=1 (the disclaim
API returns EINVAL on Tahoe — see _disclaim.py for the dead-end attempt),
and bootout/bootstrap install. 9 new write-path unit tests, 20/20 pass.

Full debugging trail in docs/sessions/2026-05-05-0834-signed-local-bridge-working.md.
Lifts get_logbook, get_trash, get_todos, get_random_todos, get_tagged_items,
search_advanced, search_all_items, and get_recent off direct things-py
calls and onto the provider abstraction so they share the bridge's durable
TCC identity instead of inheriting whichever transient runtime hosts the
MCP server.

Adds trash() and last(period) to the ThingsProvider protocol, with bridge
endpoints (GET /things/trash, GET /things/last/{period}), worker dispatch
in run_sqlite_action, and provider implementations across BridgeThings,
DirectThings, and CacheThings (cache returns empty since the snapshot
doesn't materialise these views). 16 new unit tests, 36/36 pass.

After this commit, all 14 read tools and 4 write tools in fast_server flow
through the provider chain. The bridge is now the durable single identity
for every TCC-protected Things operation.
Captures the full arc: write API, remaining-reads migration, dead-ends
(spctl red herring, broken disclaim API), reboot-as-fix, and final state
of all 23 MCP tools routed through the bridge provider chain.
…fix auth/threading/routing

Three reviewers flagged a regression I'd introduced in the previous commit
(P1: writes were bridge-only by default, breaking installs without the bridge
built/authorised). Plus several real correctness/security issues that the
unit tests had missed.

Provider chain (Codex P1, Claude C1):
- AutoThingsProvider gains a separate write_providers chain so writes always
  fall back to direct AppleScript when the bridge is unavailable. Bridge stays
  optional and additive, not a breaking replacement.
- Three direct things.get() calls in fast_server (post-create location lookup
  for add_task / add_new_project, and show_item) are now routed through the
  provider via _resolve_todo_location and _resolve_project_location. They were
  silently bypassing the bridge for the post-create UX path, returning
  "in Unknown" on any host without FDA.
- CacheThingsProvider.trash() and last() now raise cache_miss instead of
  returning [], so the auto-provider falls through to direct rather than
  treating an empty list as a successful hit (Codex P2).

Bridge HTTP server (Claude C2/C3, H1, H4, H5, S9):
- Bearer token comparison uses hmac.compare_digest (constant time) and fails
  closed on empty tokens, fixing the "empty token authorizes empty header"
  bug.
- Token file is created atomically with os.open(..., O_EXCL | O_WRONLY, 0o600)
  rather than write_text-then-chmod, closing the brief race where another
  same-UID process could read the freshly-written token.
- UnixHTTPServer mixes in ThreadingMixIn so a hung worker doesn't block
  /health or other concurrent requests.
- live_or_cache annotates the response with cache_error when both live and
  cache fail, so users see both failures at once.
- _disclaim.py removed entirely. responsibility_spawnattrs_setdisclaim is
  broken on macOS Tahoe (verified) and the disclaim path was a dead-end the
  bridge no longer uses; keeping it gated off was just cargo. Session note
  retains the debugging trail in case Apple ever fixes the API.

Worker discovery (Gemini HIGH):
- _enumerate_db_patterns sorts by mtime so the most recently active
  ThingsData-* hash folder is selected. ThingsData-* names are random
  alphanumeric — lexicographic sort doesn't reflect recency.

Build (Codex P2):
- build_bridge_app.sh creates BUILD_DIR before writing the entrypoint script,
  so first-time setup works on a clean checkout.

Tests:
- 8 new unit tests covering write fallback, cache miss, constant-time auth,
  empty-token rejection, missing-header rejection, atomic token creation,
  live+cache combined errors, and the default write chain factory.
- Updated test_search_advanced_empty_results to keep the strict
  "Error in advanced search:" prefix contract — the production code now
  preserves it whether the error came via the bridge or directly.
- 6 list-operation tests now use a split_top_level_items helper so user notes
  containing markdown horizontal rules don't break the format-verification
  splits (pre-existing fragility, not a routing regression).
- test_create_tag_independently retries the AppleScript cleanup once before
  asserting (the cleanup AppleScript can time out on large databases).
- test_success_logging_includes_location now asserts on the MCP tool's return
  value rather than in-process logs from applescript_bridge (which now run
  inside the worker subprocess and are unobservable from the test process).

44 unit tests pass, ruff/format/mypy/bandit all clean.
End-to-end smoke tests pass: snapshot returns live data, unauthenticated
requests get 401, /health responds in 19ms while a concurrent /snapshot is in
flight (threading works).
The Things database belongs to Things — we never write to it directly.
All mutations go through AppleScript so Things owns its sync engine. Codify
the three layers of protection (mode=ro URI flag, things-py read-only API,
AppleScript-only mutation path) where users can find it.
Documents and accommodates a real architectural seam discovered while making
the integration suite green: when the bridge is healthy and the cache has any
snapshot, an unknown-tag ValueError from the live worker triggers a cache
fallback that iterates the snapshot and returns []. The user sees "No items
found matching your search criteria" — looks like "no matches" rather than a
validation error. The original ValueError survives in the response envelope's
live_error field, but the MCP-tool layer only forwards data.

The test exists to verify tag-spelling mistakes surface usefully to callers,
which is true on the direct path. Pin THINGS3_MCP_PROVIDER=direct for this
specific test so we exercise the strict path; the bridge-fallback tradeoff is
documented in the test docstring.

Background context: a future improvement would be to distinguish validation
errors from transient errors in the worker (e.g. raise things_validation_error
vs things_db_unreadable) and skip cache fallback for the former. Out of scope
for this PR.
Audit results: the pattern affects two production tools.

- search_advanced (status/tag/start/type kwargs) — already addressed by
  pinning the test to direct provider mode.
- get_tagged_items (tag kwarg) — same shape: typo'd tag silently returns
  "No items found with tag 'travl'" instead of "Unrecognized tag — did you
  mean 'travel'?". Documented as a known limitation.

Also mapped which things-py kwargs raise vs return empty (tag/status/start/
type/last-period raise ValueError; project-uuid/area-uuid/get/search return
empty), so future contributors can reason about whether a new pre-validation
guard is needed before adding a new MCP tool.

Mitigation options listed in priority order; the worker-error-code
distinction (things_validation_error vs things_db_unreadable) is the
cleanest single fix and is added to the "next steps" list.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2242c2a426

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +132 to +135
cached = _cache_envelope(action, params)
if cached.get("ok"):
cached["live_error"] = {key: live.get(key) for key in ("error_code", "message", "authorization_hint") if live.get(key)}
return cached
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve live validation errors instead of masking with cache

This unconditional cache fallback turns live query errors into apparent empty results, which changes user-visible semantics for invalid filters. For example, an invalid tag in /things/todos can fail in the live worker but still return ok from cache as [], so callers see “no matches” instead of a validation error. The docstring says fallback is for authorization/read availability failures, but this branch applies to every live failure and can silently hide bad inputs.

Useful? React with 👍 / 👎.

Comment on lines +139 to +142
except ProviderError as exc:
if first_error is None:
first_error = exc
continue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid retrying non-idempotent writes after ambiguous bridge errors

Catching every ProviderError and immediately falling through to the next write provider can duplicate creates. If the bridge times out or disconnects after committing add_task/add_project, this loop retries via direct AppleScript and creates a second item because the first attempt’s outcome is unknown. The current logic treats transport/timeout failures the same as pre-dispatch unavailability, which is unsafe for non-idempotent mutations.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant