fix(models): retry transient Gemini errors (504/503/500) and extend request timeout #285
Open
AdvancedUno wants to merge 1 commit into
generate_content() was called without a request timeout, so the SDK's
default ~60s per-request deadline applied and larger section calls raised
504 DeadlineExceeded. The retry loop only handled 429 ResourceExhausted,
so transient 504/503/500 errors propagated and aborted the entire
extraction in _extract_all_sections_separately.
- Pass request_options={"timeout": 600} to extend the per-request deadline
- Retry DeadlineExceeded (504), ServiceUnavailable (503) and
InternalServerError (500) with the same exponential backoff used for 429
Fixes interviewstreet#284
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Fixes the Gemini provider aborting resume extraction on transient `504 Deadline expired` errors.
Two changes in `GeminiProvider.chat()` (`models.py`):
- Pass `request_options={"timeout": 600}` to `generate_content()`. The SDK's default per-request gRPC deadline (~60s) is too short for larger section calls on `gemini-2.5-flash`/`gemini-3.5-flash`, which raised `DeadlineExceeded` (504).
- The retry loop previously handled only `ResourceExhausted` (429). It now also retries `DeadlineExceeded` (504), `ServiceUnavailable` (503), and `InternalServerError` (500) using the same exponential-backoff-with-jitter logic.
Previously, a single transient 504 on any section caused `_extract_all_sections_separately` to abort the whole run with no output.
Before / After
Before: one transient 504 aborts the entire extraction.
After: transient errors are retried with backoff and the run completes.
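The retry behavior described in the summary can be illustrated with a minimal, self-contained sketch. This is not the code from `models.py`: the exception classes below are local stand-ins for the SDK exceptions named in this PR, and `call_with_retry`, `max_retries`, `base_delay`, the injected `sleep`, and the 0 to 1 second jitter range are all hypothetical names and values chosen for illustration. Only the `request_options={"timeout": 600}` argument and the `2 ** attempt` backoff idiom come from the PR text.

```python
import random
import time


# Local stand-ins so the sketch runs without the Gemini SDK installed.
# They mirror the exception names mentioned in the PR description.
class ResourceExhausted(Exception):
    """Stand-in for the 429 error."""


class DeadlineExceeded(Exception):
    """Stand-in for the 504 error."""


class ServiceUnavailable(Exception):
    """Stand-in for the 503 error."""


class InternalServerError(Exception):
    """Stand-in for the 500 error."""


# Errors treated as transient and therefore worth retrying.
RETRYABLE = (ResourceExhausted, DeadlineExceeded, ServiceUnavailable, InternalServerError)


def call_with_retry(generate, prompt, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call generate(prompt, request_options=...) and retry transient errors.

    `generate` is any callable with that shape (hypothetical here); errors that
    are not in RETRYABLE propagate immediately.
    """
    for attempt in range(max_retries):
        try:
            # Extended per-request deadline, as described in the PR.
            return generate(prompt, request_options={"timeout": 600})
        except RETRYABLE:
            if attempt == max_retries - 1:
                raise  # out of attempts: surface the last transient error
            # Exponential backoff (2 ** attempt) plus random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)
```

Passing `sleep` as a parameter is only a convenience for exercising the loop without real waiting; a production loop could call `time.sleep` directly.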
Testing
- Ran `python score.py <resume>.pdf` end to end with `LLM_PROVIDER=gemini`, `DEFAULT_MODEL=gemini-2.5-flash`.
- Before the fix: `504 Deadline expired before operation could complete.` on the `projects`/`awards` sections (reproduced across multiple runs).
Notes
- Follows the existing code style (`2 ** attempt` idiom, same retry-loop structure) rather than reformatting unrelated lines, to keep the diff focused on the fix.
Fixes #284