Summary
Request: add a Z.ai (GLM) LLM provider as a first-class backend, alongside the existing Ollama and Gemini providers, so users can run the resume-scoring pipeline with LLM_PROVIDER=zai and GLM models.
Why
The codebase already has a clean provider abstraction (ModelProvider enum + LLMProvider protocol), making it cheap to add backends. Z.ai exposes an OpenAI-compatible chat-completions endpoint, so a provider can be added with no new dependencies — it reuses the already-present requests. GLM models are a strong fit for structured resume evaluation:
glm-5.2 — a reasoning model (response carries reasoning_content + content); strong at structured/JSON output.
glm-4.6 — a faster, non-reasoning fallback on the same endpoint.
This gives users one more hosted option without needing a local Ollama daemon or a Google API key.
Proposal
- Endpoint:
POST https://api.z.ai/api/coding/paas/v4/chat/completions with Authorization: Bearer $Z_AI_API_KEY and the standard OpenAI request shape {model, messages, temperature, top_p, max_tokens}.
- Parse
choices[0].message.content (ignore reasoning_content); set a generous default max_tokens so reasoning doesn't starve the output; retry 429/5xx with backoff.
- Wire it through
ModelProvider.ZAI + a ZaiProvider, glm-5.2/glm-4.6 in prompt.py, a ZAI branch in llm_utils.initialize_llm_provider, and docs in .env.example / README.md.
- Purely additive — no changes to Ollama/Gemini providers.
Implementation
👉 Fix in PR #294 — #294
The PR is fully implemented, black-clean, verified end-to-end (real score.py run through glm-5.2 producing a valid EvaluationData), and introduces no new dependencies. Happy to adjust anything to match the project's conventions.
Summary
Request: add a Z.ai (GLM) LLM provider as a first-class backend, alongside the existing Ollama and Gemini providers, so users can run the resume-scoring pipeline with
LLM_PROVIDER=zaiand GLM models.Why
The codebase already has a clean provider abstraction (
ModelProviderenum +LLMProviderprotocol), making it cheap to add backends. Z.ai exposes an OpenAI-compatible chat-completions endpoint, so a provider can be added with no new dependencies — it reuses the already-presentrequests. GLM models are a strong fit for structured resume evaluation:glm-5.2— a reasoning model (response carriesreasoning_content+content); strong at structured/JSON output.glm-4.6— a faster, non-reasoning fallback on the same endpoint.This gives users one more hosted option without needing a local Ollama daemon or a Google API key.
Proposal
POST https://api.z.ai/api/coding/paas/v4/chat/completionswithAuthorization: Bearer $Z_AI_API_KEYand the standard OpenAI request shape{model, messages, temperature, top_p, max_tokens}.choices[0].message.content(ignorereasoning_content); set a generous defaultmax_tokensso reasoning doesn't starve the output; retry 429/5xx with backoff.ModelProvider.ZAI+ aZaiProvider,glm-5.2/glm-4.6inprompt.py, aZAIbranch inllm_utils.initialize_llm_provider, and docs in.env.example/README.md.Implementation
👉 Fix in PR #294 — #294
The PR is fully implemented,
black-clean, verified end-to-end (realscore.pyrun throughglm-5.2producing a validEvaluationData), and introduces no new dependencies. Happy to adjust anything to match the project's conventions.