feat: add job description scoring mode with weighted evaluation and semantic matching by Kingsam147 · Pull Request #283 · interviewstreet/hiring-agent

Kingsam147 · 2026-06-26T03:44:14Z

Summary

The existing pipeline evaluates resumes against a fixed rubric hardcoded for a single role (Software Intern at HackerRank). This PR adds a second evaluation mode that accepts any job description and scores the resume against it using a 7-category weighted model, making the tool useful for any role.

At startup the user is prompted to choose between the two modes. The original HackerRank scoring is untouched.

How it works

Input: paste any job description into job_description.txt in the project root, place resume.pdf in the same directory, then run python score.py and select mode 2.

Both files ship as empty placeholders in the repo so new users know exactly where to put their inputs. To prevent personal content from ever being staged or pushed, run once after cloning:

git update-index --skip-worktree job_description.txt resume.pdf

Pipeline (mode 2):

1. LLM extracts structured requirements from the job description (job title, required skills, preferred skills, years of experience, education requirements, must-have qualifications)
2. Sentence Transformers (all-MiniLM-L6-v2) computes cosine similarity between the job description and resume embeddings for semantic/keyword matching
3. LLM scores six categories against the extracted requirements and full job description text
4. Weighted total is computed in Python
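The Sentence Transformers step could look roughly like the sketch below. It is an illustration, not the code in this PR: the function names `cosine_similarity` and `semantic_similarity` are hypothetical, and it assumes the `sentence-transformers` package is installed and the all-MiniLM-L6-v2 model can be downloaded. How the raw similarity is mapped onto the category score is not specified above, so the sketch stops at the similarity value.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector is treated as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_similarity(job_description: str, resume_text: str) -> float:
    """Embed both texts with all-MiniLM-L6-v2 and compare the embeddings.

    Hypothetical helper; the PR's own function may be structured differently.
    """
    # Imported lazily so cosine_similarity() works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    jd_vec, resume_vec = model.encode([job_description, resume_text])
    return cosine_similarity(jd_vec.tolist(), resume_vec.tolist())
```

Loading the model once and reusing it across runs would avoid repeated start-up cost; the sketch loads it inside the function only to stay short.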

Scoring weights:

Category	Weight
Skills Match	30%
Experience Match	20%
Keyword & Semantic Match (Sentence Transformers)	15%
Job Title Alignment	10%
Education & Certifications	10%
Resume Quality	10%
Missing Critical Requirements	5%
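As a sketch of how the weighted total might be computed from the table above, assuming each category is scored on a 0 to 100 scale (the scale, the dictionary keys, and the name `weighted_total` are assumptions for illustration, not taken from the PR's code):

```python
from typing import Mapping

# Weights copied from the table above; the key names are hypothetical.
WEIGHTS = {
    "skills_match": 0.30,
    "experience_match": 0.20,
    "keyword_semantic_match": 0.15,
    "job_title_alignment": 0.10,
    "education_certifications": 0.10,
    "resume_quality": 0.10,
    "missing_critical_requirements": 0.05,
}


def weighted_total(scores: Mapping[str, float]) -> float:
    """Combine per-category scores (assumed 0-100) into one weighted total."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise KeyError(f"missing category scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
```

Because the weights sum to 100%, a resume scoring 100 in every category totals 100 (up to floating-point rounding), and the total stays on the same scale as the category scores.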

Results are written to job_evaluations.csv in development mode (separate from the existing resume_evaluations.csv).
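Appending to the CSV could be done along these lines. This is a minimal sketch using only the standard library; the function name `append_evaluation` and the column names in the test data are hypothetical, and the PR's actual columns are not listed here.

```python
import csv
from pathlib import Path
from typing import Mapping, Union


def append_evaluation(csv_path: Union[str, Path], row: Mapping[str, object]) -> None:
    """Append one evaluation row, writing the header only for a new or empty file."""
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        # Assumption: every row passed in uses the same keys in the same order.
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

Writing to a separate file keeps the new mode's columns from colliding with those of resume_evaluations.csv.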

Changes

New files:

job_description.txt — empty placeholder for the job description input
resume.pdf — empty placeholder for the candidate resume
prompts/templates/job_description_extraction.jinja
prompts/templates/job_evaluation_criteria.jinja
prompts/templates/job_evaluation_system_message.jinja

Modified files:

models.py — JobDescriptionData, JobCategoryScore, JobScores, LLMJobEvaluationResponse, JobEvaluationData Pydantic models
evaluator.py — JobDescriptionEvaluator class
score.py — mode selector, fixed resume.pdf path, routing, new output formatter, job_evaluations.csv writing; resume cache now invalidated automatically when resume.pdf is replaced (mtime comparison)
transform.py — transform_job_evaluation_response(), removed stale pdb import
prompts/template_manager.py — registers three new templates
requirements.txt — adds sentence-transformers
.gitignore — adds job_evaluations.csv, package-lock.json; removes resume.pdf now that it is a tracked placeholder

Test plan

Validated with a real resume in mode 1 (original HackerRank scoring) — output unchanged from pre-PR behaviour
Validated with a real resume + job description in mode 2 using Ollama (gemma3:4b)
Validated with a real resume + job description in mode 2 using Gemini (gemini-2.5-flash)
Empty job_description.txt in mode 2 exits with a clear error message
Missing resume.pdf exits with a clear error message
job_evaluations.csv is created and appended to correctly in development mode
Personal resume and job description content cannot be accidentally staged or pushed (--skip-worktree)
Replacing resume.pdf with a different file triggers automatic cache invalidation on the next run
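The two error-exit cases in the test plan could be implemented with checks like the following. This is a sketch under assumptions: `InputError`, `load_job_description`, and `require_resume` are hypothetical names, and the error wording is illustrative rather than the PR's actual messages.

```python
from pathlib import Path
from typing import Union


class InputError(Exception):
    """Raised when a required input file is missing or empty (hypothetical)."""


def load_job_description(path: Union[str, Path] = "job_description.txt") -> str:
    """Return the job description text, rejecting a missing or empty file."""
    file = Path(path)
    if not file.is_file():
        raise InputError(f"{file} not found; create it and paste a job description.")
    text = file.read_text(encoding="utf-8").strip()
    if not text:
        raise InputError(f"{file} is empty; paste a job description before using mode 2.")
    return text


def require_resume(path: Union[str, Path] = "resume.pdf") -> Path:
    """Return the resume path, rejecting a missing file."""
    file = Path(path)
    if not file.is_file():
        raise InputError(f"{file} not found; place your resume in the project root.")
    return file
```

A caller could catch `InputError` at the top level and pass the message to `sys.exit()`, which prints it and exits with a non-zero status.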

…emantic matching

Adds a second evaluation mode alongside the original HackerRank scoring. When selected, the pipeline reads a job description from job_description.txt and scores the resume against it using a 7-category weighted model:

- Skills Match (30%): LLM extracts required/preferred skills from the JD and checks the resume for each, weighting required skills at 80%
- Experience Match (20%): LLM judges relevance of work history and projects
- Keyword & Semantic Match (15%): Sentence Transformers (all-MiniLM-L6-v2) cosine similarity between JD and resume embeddings
- Job Title Alignment (10%): LLM compares previous titles to the target role
- Education & Certifications (10%): LLM checks degree and cert requirements
- Resume Quality (10%): LLM grades action verbs and quantified achievements
- Missing Critical Requirements (5%): penalises absent must-have qualifications

At startup the user is prompted to choose between the two modes. Choosing mode 2 with an empty job_description.txt exits with a clear error message. Results are written to job_evaluations.csv in development mode.

New files:
- job_description.txt: empty placeholder for the job description input
- prompts/templates/job_description_extraction.jinja
- prompts/templates/job_evaluation_criteria.jinja
- prompts/templates/job_evaluation_system_message.jinja

Modified files:
- models.py: JobDescriptionData, JobCategoryScore, JobScores, LLMJobEvaluationResponse, JobEvaluationData Pydantic models
- evaluator.py: JobDescriptionEvaluator class
- score.py: mode selector, fixed resume.pdf path, routing, output formatter
- transform.py: transform_job_evaluation_response(), removed stale pdb import
- prompts/template_manager.py: registers three new templates
- requirements.txt: adds sentence-transformers
- .gitignore: adds resume.pdf, job_evaluations.csv, package-lock.json

Kingsam147 added 4 commits June 25, 2026 23:43

chore: untrack job_description.txt and add to .gitignore

11c7a9f

Keeps the empty placeholder in the repo for new users to clone but prevents personal job description content from being pushed.

chore: add empty placeholder files for resume.pdf and job_description.txt

1a89f86

Both files are tracked as empty placeholders so new users know where to put their inputs. Use 'git update-index --skip-worktree' on each file to prevent personal content from being staged or pushed.

fix: invalidate resume cache when pdf is newer than cached data

492b72e

Compares modification timestamps so a replaced resume.pdf triggers a full re-extraction on the next run instead of serving stale cache.
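The timestamp comparison described in this commit can be sketched as below. The function name `cache_is_stale` and the idea of a single cache file are assumptions for illustration; the PR's cache layout is not shown here.

```python
import os


def cache_is_stale(pdf_path: str, cache_path: str) -> bool:
    """Return True when the cached extraction should be rebuilt.

    The cache is considered stale if it does not exist yet, or if the
    PDF was modified more recently than the cache was written.
    """
    if not os.path.exists(cache_path):
        return True
    return os.path.getmtime(pdf_path) > os.path.getmtime(cache_path)
```

One caveat of any mtime-based check: copying in an older file with its original timestamp preserved would not trigger invalidation. Comparing a content hash would catch that case at the cost of reading the file.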

Kingsam147 wants to merge 4 commits into interviewstreet:main from Kingsam147:feature/job-description-scoring
