perf(providers): per-home probe-worker pool + scoped credential invalidation (#3787) #3940
rodboev wants to merge 5 commits into
| Filename | Overview |
|---|---|
| api/providers.py | Pool expanded to list[Worker], lock-held handoff implemented, scoped flush added; stale-detection logic for null-proc workers is inconsistent with an existing test assertion. |
| tests/test_issue3787_probe_pool.py | New test suite covering pool sizing, non-blocking acquire, saturation fallback, scoped invalidation, concurrent no-double-claim, cleanup replenishment. |
| tests/test_provider_quota_status.py | Adapted to list-of-workers pool shape; test_account_usage_cleanup_removes_null_proc_worker retains an assertion that will fail under the new stale-detection logic. |
Sequence Diagram
sequenceDiagram
participant C as Caller
participant G as _get_account_usage_probe_worker
participant P as worker_pool dict
participant W0 as Worker[0]
participant W1 as Worker[1]
participant F as cold fallback
C->>G: call(home)
G->>P: acquire pool_lock, get workers list
alt pool missing
G->>P: create 2 new Workers, store list
end
G->>P: release pool_lock
G->>W0: "_lock.acquire(blocking=False)"
alt W0 free
W0-->>G: acquired
G-->>C: return W0 (lock held)
C->>W0: _fetch_locked()
W0-->>C: snapshot
C->>W0: _lock.release()
else W0 busy
G->>W1: "_lock.acquire(blocking=False)"
alt W1 free
W1-->>G: acquired
G-->>C: return W1 (lock held)
C->>W1: _fetch_locked()
W1-->>C: snapshot
C->>W1: _lock.release()
else pool saturated
G-->>C: return None
C->>F: _fetch_account_usage_once_for_home()
F-->>C: snapshot (cold subprocess)
end
end
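The handoff in the diagram can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in api/providers.py: `Worker`, `get_worker`, `cold_fetch`, `fetch`, and `POOL_SIZE` are hypothetical stand-ins, and the real probe and subprocess logic is replaced by placeholder dictionaries. What it shows is the selector returning a worker with its lock still held, the caller releasing it, and the fallback when both workers are busy.

```python
import threading

POOL_SIZE = 2  # assumption: two workers per home, as drawn in the diagram


class Worker:
    """Hypothetical stand-in for a probe worker; only the lock handoff is modelled."""

    def __init__(self, home):
        self.home = home
        self._lock = threading.Lock()

    def fetch_locked(self, provider):
        # The caller must already hold self._lock when calling this.
        return {"provider": provider, "source": "warm"}


_pool = {}                      # home -> list of Worker
_pool_lock = threading.Lock()   # guards _pool only, never held during a fetch


def get_worker(home):
    """Return a worker whose lock is already held, or None if the pool is saturated."""
    with _pool_lock:
        workers = _pool.get(home)
        if workers is None:
            workers = _pool[home] = [Worker(home) for _ in range(POOL_SIZE)]
    for w in workers:
        if w._lock.acquire(blocking=False):
            return w  # lock stays held; the caller releases it
    return None


def cold_fetch(provider, home):
    # Placeholder for the cold-subprocess path.
    return {"provider": provider, "source": "cold"}


def fetch(provider, home):
    w = get_worker(home)
    if w is None:
        return cold_fetch(provider, home)
    try:
        return w.fetch_locked(provider)
    finally:
        w._lock.release()
```

Because the lock is never dropped between selection and use, no second caller can claim the same worker in that window.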
Reviews (5): Last reviewed commit: "fix(providers): treat proc=None as idle,..."
Read the full …

The selector reserves nothing: it acquires, releases, then returns

    for w in workers:
        if w._lock.acquire(blocking=False):
            w._lock.release()
            return w
    if workers[0]._lock.acquire(timeout=_ACCOUNT_USAGE_WORKER_WAIT_SECONDS):
        workers[0]._lock.release()
        return workers[0]
    return None

The caller then re-acquires in …

    def fetch(self, provider, *, api_key=None):
        if not self._lock.acquire(blocking=False):
            return _fetch_account_usage_once_for_home(provider, self.home, api_key=api_key)

Between the …

Suggested shape: hand back a held worker

Have the selector acquire and keep the lock, returning the locked worker (or …

    for w in workers:
        if w._lock.acquire(blocking=False):
            return w  # caller releases
    if workers[0]._lock.acquire(timeout=_ACCOUNT_USAGE_WORKER_WAIT_SECONDS):
        return workers[0]
    return None

and in …

Test gap

Minor: scoped-flush vs cache asymmetry

In … The cleanup refactor (per-worker stale collection + …
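One way to exercise the no-double-claim property of the held-worker shape is a small thread race. The sketch below is an assumption-laden illustration rather than the test in tests/test_issue3787_probe_pool.py: `select_held` and `run_claim_race` are hypothetical names, and bare `threading.Lock` objects stand in for workers. No thread releases its claim during the race, so each lock can be won at most once.

```python
import threading


def select_held(workers):
    """Selector in the suggested shape: return a worker (here, a lock) still held."""
    for w in workers:
        if w.acquire(blocking=False):
            return w  # caller is responsible for releasing
    return None


def run_claim_race(n_threads=8, n_workers=2):
    workers = [threading.Lock() for _ in range(n_workers)]
    barrier = threading.Barrier(n_threads)
    claims = []
    claims_guard = threading.Lock()

    def claim():
        barrier.wait()  # start all threads at the same moment
        w = select_held(workers)
        with claims_guard:
            claims.append(w)
        # Deliberately no release here: every claim stays held for the whole race.

    threads = [threading.Thread(target=claim) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    held = [w for w in claims if w is not None]
    for w in held:
        w.release()  # clean up after the race
    return held, claims
```

With the acquire-release-return selector, the same race could hand one worker to several threads, which is the window the review describes.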
Followed up on the three commits pushed after my last review ( …

The TOCTOU handoff is fixed, thank you

    def _get_account_usage_probe_worker(home: Path) -> "_AccountUsageProbeWorker | None":
        ...
        for w in workers:
            if w._lock.acquire(blocking=False):
                return w  # caller releases
        return None

and …

But …
Thinking Path
What Changed
Why It Matters
Same-home contention under rapid Settings-panel polls no longer degrades to repeated cold subprocess spawns, which reduces tail latency for quota probes. Credential rotation for one provider also stops evicting warm workers for unrelated providers.
Verification
Risks / Follow-ups
Model Used
Claude Opus 4.6 via Claude Code CLI