Source
Sentry: https://sentry.tinyhumans.ai/organizations/tinyhumans/issues/10246/
Short ID: TAURI-RUST-92J (project tauri-rust)
Events: 10,158 · Users affected: 1 · First seen: 2026-06-03 04:50 UTC · Last seen: 2026-06-04 10:31 UTC
Reproducing release: openhuman@0.56.0+e8968077aeb5
Platform: Windows 10.0.26200 (Windows 11 24H2) · x86_64
Symptom
Failed to replace auth profile store at C:\Users\<user>\.openhuman\users\<uid>\auth-profiles.json
Captured by report_error_or_expected from the JSON-RPC error path during openhuman.app_state_snapshot (domain rpc, operation invoke_method, elapsed_ms ≈ 3). Message-only Sentry event (no stack).
Where it fails
src/openhuman/credentials/profiles.rs:931-943 (function write_persisted_locked):
fs::write(&tmp_path, &json).with_context(|| {
    format!("Failed to write temporary auth profile file at {}", tmp_path.display())
})?;
fs::rename(&tmp_path, &self.path).with_context(|| {
    format!("Failed to replace auth profile store at {}", self.path.display())
})?;
Neither fs::write nor fs::rename is wrapped in crate::openhuman::util::retry_with_backoff, which is the helper that handles the exact Windows transient FS-error family (is_transient_fs_error already recognises ERROR_ACCESS_DENIED (5), ERROR_SHARING_VIOLATION (32), ERROR_LOCK_VIOLATION (33), ERROR_DELETE_PENDING (303), ERROR_USER_MAPPED_FILE (1224) — see src/openhuman/util.rs:615).
The same helper IS used for the sibling .lock create at profiles.rs:987 (Sentry OPENHUMAN-TAURI-H1 / H8 fix, PRs #2085 / #1641). The .json rename path was left out — partial fix.
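To make the missing retry concrete, here is a minimal sketch of a write-then-rename that retries on the Windows error codes listed above. Both helpers are stand-ins written for illustration: the real `retry_with_backoff` and `is_transient_fs_error` live in src/openhuman/util.rs and their exact signatures are not shown in this report. The attempt count (6), the 100 ms base delay, the doubling policy, and the name `replace_file_atomically` are assumptions.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Stand-in classifier (hypothetical): treats the raw OS error as a Windows
/// error code and matches the transient family named in this report.
fn is_transient_fs_error(err: &io::Error) -> bool {
    // 5 = ERROR_ACCESS_DENIED, 32 = ERROR_SHARING_VIOLATION,
    // 33 = ERROR_LOCK_VIOLATION, 303 = ERROR_DELETE_PENDING,
    // 1224 = ERROR_USER_MAPPED_FILE
    matches!(err.raw_os_error(), Some(5 | 32 | 33 | 303 | 1224))
}

/// Stand-in retry helper (hypothetical signature). Runs `op` at least once and
/// at most `max_attempts` times, sleeping between attempts and doubling the
/// delay each time. Non-transient errors are returned immediately.
fn retry_with_backoff<T>(
    max_attempts: u32,
    base_delay_ms: u64,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut delay_ms = base_delay_ms;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(e) if attempt < max_attempts && is_transient_fs_error(&e) => {
                thread::sleep(Duration::from_millis(delay_ms));
                delay_ms = delay_ms.saturating_mul(2);
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}

/// Writes `bytes` to `tmp`, then renames `tmp` over `dest`, retrying each step.
fn replace_file_atomically(tmp: &Path, dest: &Path, bytes: &[u8]) -> io::Result<()> {
    retry_with_backoff(6, 100, || fs::write(tmp, bytes))?;
    retry_with_backoff(6, 100, || fs::rename(tmp, dest))
}
```

Under these assumed parameters a step that never succeeds sleeps 100 + 200 + 400 + 800 + 1600 ms, about 3.1 s in total, before the error is returned to the caller.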
Why the event count is 10k+ in 24h
load_locked runs on every app_state_snapshot poll. When a profile is dropped (decrypt failure, unrecognized kind, or — pre-#3125 — OAuth missing access_token), load_locked calls write_persisted_locked at profiles.rs:744 to persist the purge. If the rename fails, the on-disk state is unchanged, so the next app_state_snapshot poll re-drops the same profile, re-attempts the same write, and re-fails. Tight loop until the file handle is released — and on Windows, AV / Search-Indexer / Defender can hold a file handle for many seconds.
The frontend health-check polls app_state_snapshot rapidly, so a single sustained AV hold amplifies into thousands of Sentry events.
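The loop can be modelled in a few lines. This is a toy illustration only: `Store`, its fields, and the `"corrupt"` marker are hypothetical and do not mirror the real types in profiles.rs. It shows that while the rename is blocked, every poll re-drops the same profile and produces one more reported error.

```rust
/// Toy model of the auth profile store (hypothetical, not the real type).
struct Store {
    /// Profiles currently persisted on disk.
    on_disk: Vec<&'static str>,
    /// Stands in for another process holding a handle on the file.
    rename_blocked: bool,
    /// Number of errors that would have been sent to Sentry.
    reported_errors: u32,
}

impl Store {
    /// Models one load: drop unusable profiles, then try to persist the purge.
    fn load_locked(&mut self) -> Vec<&'static str> {
        let kept: Vec<&'static str> = self
            .on_disk
            .iter()
            .copied()
            .filter(|p| *p != "corrupt")
            .collect();
        if kept.len() != self.on_disk.len() {
            if self.rename_blocked {
                // Rename failed: disk is unchanged, so the next poll repeats this.
                self.reported_errors += 1;
            } else {
                self.on_disk = kept.clone();
            }
        }
        kept
    }
}
```

In this model, 1,000 polls during a single blocked window yield 1,000 reported errors, and the count stops growing only after one poll succeeds in persisting the purge.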
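Reproduces on
- upstream/main@87a91ae02 (v0.57.14): the profiles.rs commits since 0.56.0 (fix(auth): gracefully drop OAuth profiles with missing access_token #3125, feat: prompt user consent when OS keyring is unavailable #3075) do not touch the rename retry path. Verified via git log e8968077aeb5..upstream/main -- src/openhuman/credentials/profiles.rs.
- Hold auth-profiles.json open (e.g. via Get-Content -Wait in PowerShell) while triggering any app_state_snapshot that exercises a drop / migration branch.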
Bug shape
Windows transient FS-race on fs::rename. Same family as the lock-create races already retried in PR #1641 / #2085 / #2180. Generic classifier (is_transient_fs_error) already in place; the call site here just isn't routed through it.
Fix scope
- Route fs::write(&tmp_path, &json) and fs::rename(&tmp_path, &self.path) through retry_with_backoff("...", 6, 100, …), matching the parameters used by the .lock create at profiles.rs:987.
- Persisted-write amplification guard: when write_persisted_locked exhausts retries during a load_locked purge, log and tag the error path so that subsequent rapid app_state_snapshot polls don't replay the same write-and-fail loop until the AV handle is released. Either short-cache a "purge already attempted this session" flag, or surface the rename failure once and return the in-memory purged state without persisting. Either route defuses the 10k-event-per-day amplification.
- Add a Rust regression test using the __TEST_TRANSIENT__ sentinel that is_transient_fs_error already understands (src/openhuman/util.rs:618) to verify the rename path retries.
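One possible shape for the amplification guard, following the "purge already attempted this session" option, is sketched below. `PurgeGuard`, `PurgeOutcome`, and `try_persist` are hypothetical names introduced for illustration. The sketch assumes the caller returns the in-memory purged state regardless of the outcome and reports to Sentry only on `FailedOnce`.

```rust
/// Result of one attempt to persist a purge (hypothetical type).
#[derive(Debug, PartialEq)]
enum PurgeOutcome<E> {
    /// The purge was written to disk.
    Persisted,
    /// First failure this session: the caller should report it once.
    FailedOnce(E),
    /// A previous attempt already failed this session; no write was tried.
    Skipped,
}

/// Session-scoped flag that stops repeated write-and-fail attempts.
#[derive(Default)]
struct PurgeGuard {
    failed_this_session: bool,
}

impl PurgeGuard {
    /// Runs `persist` unless an earlier attempt in this session already failed.
    fn try_persist<E>(&mut self, persist: impl FnOnce() -> Result<(), E>) -> PurgeOutcome<E> {
        if self.failed_this_session {
            return PurgeOutcome::Skipped;
        }
        match persist() {
            Ok(()) => PurgeOutcome::Persisted,
            Err(e) => {
                self.failed_this_session = true;
                PurgeOutcome::FailedOnce(e)
            }
        }
    }
}
```

The trade-off in this variant is that a failed purge is not retried until the next session, so the dropped profile stays on disk until then. A time-based cooldown in place of the boolean flag would retry later at the cost of a little more state.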
Sentry-Issue: TAURI-RUST-92J