Description
When using the Gemini provider (LLM_PROVIDER=gemini), resume extraction intermittently aborts with:
504 Deadline expired before operation could complete.
The failure happens on individual section calls (e.g. projects, awards). Because _extract_all_sections_separately aborts the entire extraction if any single section fails, one transient 504 kills the whole run and no resume is produced:
❌ Error calling LLM for projects section: 504 Deadline expired before operation could complete.
⚠️ Failed to extract projects section. Aborting extraction to prevent partial/invalid resume data.
Root cause
In GeminiProvider.chat() (models.py):
- gemini_model.generate_content(...) is called without a request_options timeout, so the SDK's default per-request gRPC deadline (~60s) applies. Larger sections / slower models (e.g. gemini-2.5-flash, gemini-3.5-flash) routinely exceed this and raise google.api_core.exceptions.DeadlineExceeded (504).
- The retry loop only catches ResourceExhausted (429). Transient server-side errors, namely DeadlineExceeded (504), ServiceUnavailable (503), and InternalServerError (500), are not retried and propagate up, aborting extraction.
This is distinct from #186, which concerns the 429 free-tier RPM limit (already handled by the existing backoff). The 504 timeout path has no retry and no extended deadline.
Steps to reproduce
- Set LLM_PROVIDER=gemini, DEFAULT_MODEL=gemini-2.5-flash (or a 3.x flash model), and a valid GEMINI_API_KEY.
- Run python score.py <resume>.pdf on a resume with several populated sections.
- Intermittently, a section call fails with "504 Deadline expired before operation could complete." and the run aborts before producing an evaluation.
Environment
- OS: Windows 11
- Python: 3.11.7
- google-generativeai: 0.4.0 (as pinned in requirements.txt)
- Models observed: gemini-2.5-flash, gemini-3.5-flash
Expected behavior
Transient 504 / 503 / 500 errors should be retried with backoff (as 429 already is), and the per-request timeout should be large enough to accommodate normal section calls, so a single transient hiccup doesn't abort the whole extraction.
Proposed fix
- Pass request_options={"timeout": ...} to generate_content to extend the per-request deadline.
- Extend the existing exponential-backoff loop to also retry DeadlineExceeded, ServiceUnavailable, and InternalServerError.
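A minimal sketch of the two changes, for discussion only and not the actual patch. The helper name call_with_retry, its parameters, the backoff constants, and the 300-second timeout are illustrative assumptions, not taken from models.py. The helper itself has no SDK imports so it can be run on its own; the commented wiring at the bottom shows how it might wrap generate_content.

```python
import random
import time


def call_with_retry(fn, retryable, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on a retryable error, back off exponentially and try again.

    retryable is an exception class or tuple of classes. Any other
    exception propagates immediately, as does the last retryable error
    once max_attempts is reached.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait 2s, 4s, 8s, ... plus up to 1s of random jitter.
            sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 1))


# Hypothetical wiring inside GeminiProvider.chat() (assumed, not run here):
#
#   from google.api_core import exceptions as gexc
#
#   TRANSIENT = (
#       gexc.ResourceExhausted,    # 429, already retried today
#       gexc.DeadlineExceeded,     # 504
#       gexc.ServiceUnavailable,   # 503
#       gexc.InternalServerError,  # 500
#   )
#   response = call_with_retry(
#       lambda: gemini_model.generate_content(
#           prompt, request_options={"timeout": 300}  # 300s is a placeholder
#       ),
#       TRANSIENT,
#   )
```

Passing the retryable exception types in as an argument keeps 429 and the 5xx family on the same backoff path, so the existing rate-limit handling would not need a separate loop.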
I have a fix ready and will open a PR referencing this issue.