fix(models): retry transient Gemini errors (504/503/500) and extend request timeout#285

Open
AdvancedUno wants to merge 1 commit into
interviewstreet:mainfrom
AdvancedUno:fix/gemini-transient-error-retries

AdvancedUno commented Jun 26, 2026

Summary

Fixes the Gemini provider aborting resume extraction on transient 504 Deadline expired errors.

Two changes in GeminiProvider.chat() (models.py):

  1. Extend the per-request deadline. Pass request_options={"timeout": 600} to generate_content(). The SDK's default per-request gRPC deadline (~60s) is too short for larger section calls on gemini-2.5-flash / gemini-3.5-flash, so those calls raised DeadlineExceeded (504).
  2. Retry transient server errors. The backoff loop previously only handled ResourceExhausted (429). It now also retries DeadlineExceeded (504), ServiceUnavailable (503), and InternalServerError (500) using the same exponential-backoff-with-jitter logic.
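
The retry behavior described in the two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in models.py: the helper name call_with_retries, the base_delay value, and the jitter range are assumptions, and the four exception classes are local stand-ins for the identically named classes in google.api_core.exceptions so that the sketch runs without the SDK installed.

```python
import random
import time


# Local stand-ins for the google.api_core.exceptions classes named in this PR.
# In the real provider these would be imported from the SDK instead.
class ResourceExhausted(Exception):      # 429
    pass


class DeadlineExceeded(Exception):       # 504
    pass


class ServiceUnavailable(Exception):     # 503
    pass


class InternalServerError(Exception):    # 500
    pass


# Errors treated as transient and therefore worth retrying.
RETRYABLE = (
    ResourceExhausted,
    DeadlineExceeded,
    ServiceUnavailable,
    InternalServerError,
)


def call_with_retries(call, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Run call(); retry transient errors with exponential backoff plus jitter.

    max_retries, base_delay and the jitter range are illustrative values,
    not the ones used in the repository.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except RETRYABLE as exc:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_retries - 1:
                raise
            # Exponential backoff (2 ** attempt) with up to 1s of random jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            print(
                f"Transient error {type(exc).__name__} "
                f"(attempt {attempt + 1}/{max_retries}). Retrying in {delay:.1f}s..."
            )
            sleep(delay)
```

In the provider, the wrapped callable would be the generate_content() call with request_options={"timeout": 600}, so that each attempt gets the longer deadline and only the transient error classes trigger another attempt. Any other exception still propagates immediately.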

Previously, a single transient 504 on any section caused _extract_all_sections_separately to abort the whole run with no output.

Before / After

Before — one transient 504 aborts the entire extraction:

 Error calling LLM for projects section: 504 Deadline expired before operation could complete.
 Failed to extract projects section. Aborting extraction to prevent partial/invalid resume data.

After — transient errors are retried with backoff and the run completes:

[GeminiProvider] Transient error DeadlineExceeded (attempt 1/5). Retrying in 11.6s...
Total time for separate section extraction: 23.06 seconds
OVERALL SCORE: 75.0/100

Testing

  • Ran python score.py <resume>.pdf end to end with LLM_PROVIDER=gemini, DEFAULT_MODEL=gemini-2.5-flash.
  • Before: extraction aborted with "504 Deadline expired before operation could complete." on the projects / awards sections (reproduced across multiple runs).
  • After: all six sections extract successfully, GitHub enrichment and evaluation complete, and a full report is produced.

Notes

  • No prompt changes; provider-agnostic behavior preserved.
  • Matched the surrounding code style (existing 2 ** attempt idiom, same retry-loop structure) rather than reformatting unrelated lines, to keep the diff focused on the fix.

Fixes #284

generate_content() was called without a request timeout, so the SDK's
default ~60s per-request deadline applied and larger section calls raised
504 DeadlineExceeded. The retry loop only handled 429 ResourceExhausted,
so transient 504/503/500 errors propagated and aborted the entire
extraction in _extract_all_sections_separately.

- Pass request_options={"timeout": 600} to extend the per-request deadline
- Retry DeadlineExceeded (504), ServiceUnavailable (503) and
  InternalServerError (500) with the same exponential backoff used for 429

Fixes interviewstreet#284

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

[Bug] Gemini provider aborts extraction on transient 504 DeadlineExceeded; only 429 is retried