Problem
When using local Ollama models (e.g., gemma3:4b, qwen3:4b, mistral:7b), the resume section extractor in pdf.py frequently fails to parse structured sections such as skills, projects, and awards. The _extract_all_sections_separately() method returns None for one or more sections, causing the entire extraction to abort due to the fail-fast logic at lines 297–299. The same resume parses perfectly when a Gemini model is used.
Proposed Solution
The issue is likely caused by smaller Ollama models not consistently adhering to the strict JSON-only output instructions defined in skills.jinja and system_message.jinja. Despite stripping <think> tokens in llm_utils.py, these models may still generate additional text, malformed JSON, or residual reasoning tokens, causing JSON parsing to fail.
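As an illustration of tolerating that extra text, here is a minimal sketch of a more forgiving parse step. The helper name extract_json_object is hypothetical and not part of the existing llm_utils.py. It assumes the model's reply contains one well-formed JSON object somewhere in the output, possibly surrounded by prose, code fences, or leftover reasoning tags.

```python
import json
import re
from typing import Any, Dict, Optional

# Matches complete <think>...</think> reasoning blocks.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)


def extract_json_object(raw: str) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: return the first JSON object in raw, or None."""
    # Remove complete reasoning blocks.
    text = _THINK_BLOCK.sub("", raw)

    # A stray closing tag (opening tag lost) means everything before it
    # is residual reasoning, so keep only what follows it.
    marker = "</think>"
    pos = text.lower().rfind(marker)
    if pos != -1:
        text = text[pos + len(marker):]

    # Try to decode a JSON value starting at each "{" until one succeeds.
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\{", text):
        try:
            value, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(value, dict):
            return value
    return None
```

This only recovers JSON that is valid but wrapped in noise. Truly malformed JSON still returns None and would need a retry or a stricter decoding constraint.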
Possible improvements include:
- Enforce structured output more strictly
  - The format=model.model_json_schema() argument is already passed in _call_llm_for_section(), but its effectiveness varies across Ollama models. Investigate ways to enforce structured output more reliably.
- Add per-section retry logic
  - Instead of aborting on the first parsing failure, retry the LLM call for the affected section 2–3 times before marking it as failed.
- Make fail-fast behavior configurable
  - Allow partial resume extraction instead of returning None for the entire resume when a single section fails. This would improve robustness and enable graceful degradation.
- Use model-specific prompt variants
  - Smaller Ollama models may perform better with shorter, more constrained prompts. Providing model-specific prompt templates could improve JSON compliance.
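The retry and configurable fail-fast ideas could be combined roughly as follows. This is a sketch under assumptions, not the current pdf.py code: extract_sections, max_attempts, and fail_fast are hypothetical names, and call_section stands in for _call_llm_for_section(), whose real signature may differ. It is assumed to return a parsed dict on success and None on a parsing failure.

```python
from typing import Callable, Dict, Iterable, Optional


def extract_sections(
    section_names: Iterable[str],
    call_section: Callable[[str], Optional[dict]],
    max_attempts: int = 3,
    fail_fast: bool = False,
) -> Optional[Dict[str, Optional[dict]]]:
    """Hypothetical sketch: retry each section, optionally keep partial results."""
    results: Dict[str, Optional[dict]] = {}
    for name in section_names:
        parsed = None
        # Retry only the failing section, up to max_attempts calls.
        for _attempt in range(max_attempts):
            parsed = call_section(name)
            if parsed is not None:
                break
        if parsed is None and fail_fast:
            # Existing behavior: one failed section aborts the whole resume.
            return None
        # With fail_fast=False, a failed section is recorded as None.
        results[name] = parsed
    return results
```

With fail_fast=False, the caller receives every section that parsed and can decide how to handle the None entries. With fail_fast=True, the behavior matches the current abort-on-failure logic, except that each section gets its retries first.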
I am willing to submit a PR implementing per-section retry logic and/or improved prompt engineering for Ollama models.