Refresh IAM token lazily so it survives a long process pause#179

Open
sergeynenashev wants to merge 1 commit into yandex-cloud:master from sergeynenashev:iam-lazy-refresh-on-suspend

The IAM bearer token was refreshed only by a background time.AfterFunc. That timer's deadline is measured against the monotonic clock. If the process is paused for a long time (e.g. a suspended or migrated VM), the monotonic clock does not advance during the pause while wall-clock time does, so the timer can fire too late and the token may already be expired by wall-clock time when execution resumes. S3 then rejects the request with 403.
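The difference between a timer deadline and a wall-clock deadline can be sketched as follows. This is a minimal illustration, not code from this PR: `wallDeadline` and `expiredByWallClock` are hypothetical helper names, and the 3-hour pause is simulated by adding to a fixed timestamp rather than by suspending the process.

```go
package main

import (
	"fmt"
	"time"
)

// wallDeadline (hypothetical helper) turns a token lifetime into a wall-clock
// deadline in Unix nanoseconds. A plain int64 carries no monotonic reading,
// so later comparisons against time.Now().UnixNano() follow the wall clock.
func wallDeadline(issued time.Time, ttl time.Duration) int64 {
	return issued.Add(ttl).UnixNano()
}

// expiredByWallClock (hypothetical helper) reports whether the wall-clock
// reading nowNanos has reached the deadline.
func expiredByWallClock(deadlineNanos, nowNanos int64) bool {
	return nowNanos >= deadlineNanos
}

func main() {
	issued := time.Date(2026, 5, 18, 11, 0, 0, 0, time.UTC)
	deadline := wallDeadline(issued, time.Hour)

	// Simulated resume after a 3-hour pause: the wall clock has moved on,
	// even though a timer armed before the pause may not have fired yet.
	resumed := issued.Add(3 * time.Hour).UnixNano()
	fmt.Println(expiredByWallClock(deadline, resumed)) // prints "true"
}
```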

Additionally store the token deadline as a wall-clock value (time.Now().UnixNano(), which has no monotonic component) and lazily re-acquire the token from the request signer when that deadline is near. The signer runs on the hot path after the process resumes, so it is a best-effort trigger independent of the possibly-late timer. The background timer remains the primary refresh path; the lazy path uses a smaller margin and is only a safety net. Both refresh paths are serialized through a mutex, and the wall-clock deadline is read via an atomic on the lock-free fast path.
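The shape of the lazy path described above might look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the `tokenSource` type, its fields, the `lazyMargin` value, and the injected `fetch` and `now` functions are all hypothetical, and the clock is faked so the example is deterministic.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// lazyMargin is an assumed value; the PR only says the lazy margin is smaller
// than the background timer's margin.
const lazyMargin = 30 * time.Second

// tokenSource is a hypothetical token holder used only for this sketch.
type tokenSource struct {
	mu       sync.Mutex             // serializes the timer path and the lazy path
	deadline atomic.Int64           // wall-clock expiry, Unix nanoseconds
	token    atomic.Pointer[string] // current bearer token
	fetch    func() (string, time.Duration, error)
	now      func() time.Time
}

// nearExpiry is the lock-free fast-path check against the wall clock.
func (s *tokenSource) nearExpiry() bool {
	return s.now().UnixNano() >= s.deadline.Load()-int64(lazyMargin)
}

// refreshLocked fetches a new token; the caller must hold s.mu.
func (s *tokenSource) refreshLocked() error {
	tok, ttl, err := s.fetch()
	if err != nil {
		return err
	}
	s.token.Store(&tok)
	s.deadline.Store(s.now().Add(ttl).UnixNano())
	return nil
}

// onTimer is what a background time.AfterFunc callback could call.
func (s *tokenSource) onTimer() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.refreshLocked()
}

// Token is what a request signer could call before signing each request.
func (s *tokenSource) Token() (string, error) {
	if s.nearExpiry() {
		s.mu.Lock()
		var err error
		if s.nearExpiry() { // re-check: another goroutine may have refreshed
			err = s.refreshLocked()
		}
		s.mu.Unlock()
		if err != nil {
			return "", err
		}
	}
	return *s.token.Load(), nil
}

// demo drives the sketch with a fake clock and a counting fetch function.
func demo() (first, second, afterPause string, calls int) {
	clock := time.Date(2026, 5, 18, 11, 0, 0, 0, time.UTC)
	s := &tokenSource{now: func() time.Time { return clock }}
	s.fetch = func() (string, time.Duration, error) {
		calls++
		return fmt.Sprintf("token-%d", calls), time.Hour, nil
	}
	first, _ = s.Token()             // no token yet, so this fetches
	second, _ = s.Token()            // fast path, no fetch
	clock = clock.Add(3 * time.Hour) // simulated long pause
	afterPause, _ = s.Token()        // wall-clock deadline passed, so this refetches
	return first, second, afterPause, calls
}

func main() {
	fmt.Println(demo()) // prints "token-1 token-1 token-2 2"
}
```

The second `nearExpiry` check under the mutex keeps concurrent signers from each fetching a token after a resume; only the first one through the lock pays for the refresh.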

sergeynenashev requested a review from mishik on May 18, 2026, 11:40