JobQuest

Automated job application pipeline. Paste a URL, get a tailored PDF resume + ATS report + application answers + Notion tracking.

Quick Launch

Double-click JobQuest.command → Opens browser UI at http://127.0.0.1:7860

Or: python web_ui.py

The browser UI provides:

3 parallel application forms (run multiple jobs simultaneously)
Resume variant selector: Growth PM, Generalist, AI-PM (adjusts tagline, Q&A framing, AI context injection)
ATS provider selection for step 5 (Gemini, Groq, SambaNova — free tiers)
Writing steps (3, 6, 8) use the quality-first provider chain (Gemini → DeepSeek → OpenRouter)
Per-provider usage tracking
Real-time pipeline output

How It Works

Job URL
  │
  ├─ 1. Scrape job posting (Greenhouse/Lever/Ashby/Workable/Personio/Screenloop)
  ├─ 2. Read master resume from Notion
  ├─ 3. Tailor resume via LLM — three stages:
  │       3a. Analyse JD → structured tailoring brief (free-tier LLM)
  │       3b. Generate LaTeX from the brief (writing LLM)
  │       3c. Compliance check: brief vs LaTeX, log misses (free-tier LLM)
  ├─ 4. Write .tex file
  ├─ 5. Run ATS keyword coverage check (60-80% target)
  ├─ 6. Review & apply ATS edits
  ├─ 7. Compile PDF via pdflatex
  ├─ 8. Generate Q&A answers (company research + voice matching)
  └─ 9. Create Notion tracker entry
          │
          ▼
     Output: PDF + Q&A ready to copy/paste

LLM Architecture

Two separate provider tiers — one for quality, one for speed.

Writing steps (resume tailor, ATS edits, Q&A): Quality-first chain with automatic fallback.

Provider	Model	Get Key
DeepSeek V3.2	deepseek-chat	platform.deepseek.com
OpenRouter	Qwen3.5-397B	openrouter.ai
Anthropic	Haiku 4.5	console.anthropic.com

ATS check step (step 5): Free-tier providers with automatic fallback.

Provider	Daily Limit	Get Key
Gemini 3.1 Pro	~250 req	aistudio.google.com
Groq	1,000 req	console.groq.com
SambaNova	~500 req	cloud.sambanova.ai

CLI Usage

# Basic
python apply.py "https://jobs.lever.co/company/abc"

# With company URL (better research)
python apply.py "JOB_URL" --company-url "https://company.com"

# With application questions
python apply.py "JOB_URL" --questions "Why this role?" --questions "Cover letter"

# Select provider
python apply.py "JOB_URL" --provider groq

# Preview
python apply.py "JOB_URL" --dry-run

The --dry-run flag prints the planned pipeline steps without executing them: no prompts, no file writes, no API calls.
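
The flags shown above map naturally onto argparse. The following is a minimal sketch of how such a parser could look, not the actual contents of apply.py: the function name `build_parser` and the help strings are assumptions, and only the flag names come from the examples above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented apply.py flags."""
    parser = argparse.ArgumentParser(prog="apply.py")
    parser.add_argument("job_url", help="URL of the job posting")
    parser.add_argument("--company-url", help="Company site, used for research")
    # action="append" lets --questions be repeated, one question per flag.
    parser.add_argument("--questions", action="append", default=[],
                        help="Application question (repeatable)")
    parser.add_argument("--provider", help="Provider to use, e.g. groq")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print planned steps without executing them")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Repeating `--questions` collects each question into a list, which matches the "Why this role?" / "Cover letter" example above.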

Setup

# Install
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Configure .env (see .env.example)
WRITING_PROVIDER=gemini
GEMINI_API_KEY=...      # primary writing provider (also ATS check step, free)
DEEPSEEK_API_KEY=...    # writing fallback
GROQ_API_KEY=...        # optional ATS fallback
SAMBANOVA_API_KEY=...   # optional ATS fallback
OPENROUTER_API_KEY=...  # optional writing fallback
FIRECRAWL_API_KEY=...   # optional, better JS page scraping
NOTION_TOKEN=...
# ... (see .env.example for full list)

JobQuest loads environment variables from a local .env file at CLI startup.
At least one of GEMINI_API_KEY, GROQ_API_KEY, or SAMBANOVA_API_KEY must be set in .env.
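
The "at least one key" rule can be checked at startup with a few lines. This is a sketch under assumptions: the helper names `configured_ats_keys` and `require_api_key` are hypothetical, and the real check in JobQuest may differ. Only the three variable names come from the text above.

```python
import os

# Key names taken from the requirement above.
ATS_KEY_NAMES = ("GEMINI_API_KEY", "GROQ_API_KEY", "SAMBANOVA_API_KEY")


def configured_ats_keys(env=None):
    """Return the names of the keys that are set to a non-empty value."""
    env = os.environ if env is None else env
    return [name for name in ATS_KEY_NAMES if (env.get(name) or "").strip()]


def require_api_key(env=None):
    """Raise early, with a readable message, if no provider key is set."""
    found = configured_ats_keys(env)
    if not found:
        raise RuntimeError(
            "Set at least one of: " + ", ".join(ATS_KEY_NAMES) + " in .env"
        )
    return found
```

Passing a plain dict as `env` makes the check easy to test without touching the real environment.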

# Ensure pdflatex
brew install --cask mactex  # macOS

Output

output/CompanyName_YYYY-MM-DD/
  ├── tailoring_brief_*.md     # JD analysis used to drive step 3 (inspect if quality is off)
  ├── tailor_review_*.md       # Step 3c compliance check (HIGH = writing model missed the plan)
  ├── resume_tailored_*.tex    # LaTeX source
  ├── resume_tailored_*.pdf    # Ready to upload
  ├── ats_report_*.md          # Keyword coverage
  ├── qa_*.md                  # Answers to copy/paste
  └── pipeline_context.json    # Debug info
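
To locate a run's files programmatically, the folder name can be rebuilt from the company name and date. This is a sketch only: the function name `output_dir` and the rule for sanitizing the company name are assumptions, and the pipeline may normalize names differently.

```python
import re
from datetime import date
from pathlib import Path


def output_dir(company: str, day: date, root: Path = Path("output")) -> Path:
    """Build output/CompanyName_YYYY-MM-DD for a given company and date."""
    # Assumed sanitization: drop everything except letters and digits.
    safe = re.sub(r"[^A-Za-z0-9]+", "", company) or "Unknown"
    return root / f"{safe}_{day.isoformat()}"
```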

Supported Platforms

Platform	URL Pattern
Greenhouse	`boards.greenhouse.io`, `job-boards.eu.greenhouse.io`
Lever	`jobs.lever.co`
Ashby	`jobs.ashbyhq.com`
Workable	`apply.workable.com`
Personio	`.jobs.personio.de`, `.jobs.personio.com`
Screenloop	`app.screenloop.com`
Others	HTML scraping fallback
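
The table above amounts to a hostname lookup. A minimal sketch of that routing follows, assuming exact host matches for most platforms and suffix matches for Personio subdomains. The function name `detect_platform` and the returned labels are hypothetical and not taken from job_scraper.py.

```python
from urllib.parse import urlparse

# Hostnames from the table above; Personio uses per-company subdomains.
PLATFORM_HOSTS = {
    "greenhouse": ("boards.greenhouse.io", "job-boards.eu.greenhouse.io"),
    "lever": ("jobs.lever.co",),
    "ashby": ("jobs.ashbyhq.com",),
    "workable": ("apply.workable.com",),
    "screenloop": ("app.screenloop.com",),
}
PERSONIO_SUFFIXES = (".jobs.personio.de", ".jobs.personio.com")


def detect_platform(url: str) -> str:
    """Map a job URL to a platform label, or 'html_fallback' if unknown."""
    host = (urlparse(url).hostname or "").lower()
    for platform, hosts in PLATFORM_HOSTS.items():
        if host in hosts:
            return platform
    if host.endswith(PERSONIO_SUFFIXES):
        return "personio"
    return "html_fallback"
```

For example, `detect_platform("https://jobs.lever.co/company/abc")` returns `"lever"`, while an unrecognized careers page falls through to the HTML scraping path.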

Project Structure

JobQuest/
├── web_ui.py              # Browser UI (Gradio)
├── apply.py               # CLI pipeline orchestrator
├── JobQuest.command       # Double-click launcher
├── COMMANDS.md            # Full command reference
│
├── modules/
│   ├── llm_client.py      # Multi-provider LLM (Gemini/Groq/SambaNova + fallback)
│   ├── job_scraper.py     # ATS APIs + HTML scraping
│   └── pipeline.py        # 9 pipeline steps
│
├── prompts/
│   ├── rodrigo-voice.md   # Shared voice/tone/banned phrases (injected into writing steps)
│   ├── jd_analysis.md     # Step 3a: JD analysis → tailoring brief (free-tier LLM)
│   ├── resume_tailor.md   # Step 3b: LaTeX generation from brief (writing LLM)
│   ├── tailor_review.md   # Step 3c: compliance check, brief vs LaTeX (free-tier LLM)
│   ├── ats_check.md       # ATS analysis prompt
│   └── qa_generator.md    # Q&A generation prompt
│
├── scripts/
│   ├── notion_tracker.py  # Notion integration
│   └── render_pdf.py      # LaTeX → PDF
│
└── templates/
    └── resume.tex         # Master LaTeX template

Key Principles

Human-in-the-loop: System generates materials, you submit
Honest materials: Only uses skills from pre-validated master resume
ATS-aware: Targets 60-80% keyword coverage, avoids stuffing
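
To make the 60-80% target concrete, here is a sketch of a keyword coverage calculation. It is a plain string-matching stand-in, not the pipeline's method: the actual check in step 5 is driven by an LLM prompt (prompts/ats_check.md). The function names and the example keywords are assumptions.

```python
import re


def keyword_coverage(resume_text: str, keywords: list[str]) -> float:
    """Share of keywords (0.0 to 1.0) that appear as whole terms in the resume."""
    if not keywords:
        return 0.0
    text = resume_text.lower()
    hits = 0
    for kw in keywords:
        # Lookarounds avoid matching "sql" inside "nosql" and similar.
        pattern = r"(?<!\w)" + re.escape(kw.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            hits += 1
    return hits / len(keywords)


def coverage_verdict(coverage: float, low: float = 0.60, high: float = 0.80) -> str:
    """Classify coverage against the 60-80% target band."""
    if coverage < low:
        return "below target"
    if coverage > high:
        return "above target (possible stuffing)"
    return "on target"
```

With the resume text "Led A/B testing and SQL analysis for growth" and the keywords sql, a/b testing, roadmap, and growth, three of four match, giving 0.75, which falls inside the target band.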

Technical Architecture

API Integrations

Service	Purpose	API Type
Notion	Master resume storage, application tracking	REST API
Greenhouse	Job scraping	Public JSON API
Lever	Job scraping	Postings API v0
Ashby	Job scraping	GraphQL API
Workable	Job scraping	Widget API
Personio	Job scraping	HTML + JSON extraction
Screenloop	Job scraping	HTML scraping
DuckDuckGo	Company research fallback	HTML scraping
Firecrawl	Enhanced web scraping (JS, anti-bot)	REST API

LLM Fallback Strategy

Writing steps (3, 6, 8) — quality-first:

Gemini 2.5 Flash (free, 1500 RPD)
    │
    └─ Any error? → DeepSeek V3.2
                         │
                         └─ Any error? → OpenRouter / Qwen3.5-397B
                                              │
                                              └─ Any error? → Groq → SambaNova

ATS check (step 5) — free-tier:

User-selected provider (Gemini / Groq / SambaNova)
    │
    ├─ Rate limit? → Try next model within provider
    │
    └─ All models exhausted? → Try next provider (Gemini → Groq → SambaNova)

    Gemini model order: 3.1-pro → 3-pro → 3-flash → 2.5-pro → 2.5-flash → 2.5-flash-lite
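
Both chains above follow the same pattern: try providers in order and move on when one fails. The sketch below shows that pattern in isolation. It is not the code in modules/llm_client.py; the names `call_with_fallback` and `AllProvidersFailed` are hypothetical, and each provider is represented by a plain callable rather than a real SDK client.

```python
from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has errored."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return (name, response) of the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # "Any error?" in the diagram above
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the response makes per-provider usage tracking straightforward for the caller.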

Prompt caching: Gemini and DeepSeek V3.2 both cache prompt prefixes automatically. The user prompt is ordered static-first (master resume, templates), then dynamic (job posting, questions), so the static prefix containing the master resume is cached after the first application of the day.
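
Prefix caching only helps when the start of the prompt is byte-identical across calls, which is why the ordering matters. A sketch of static-first prompt assembly follows; the function names and section headings are assumptions, not the format used in the prompts/ files.

```python
def static_prefix(master_resume: str, template: str) -> str:
    """The part of the prompt that is identical for every application."""
    return f"## MASTER RESUME\n{master_resume}\n\n## TEMPLATE\n{template}\n\n"


def build_user_prompt(master_resume: str, template: str,
                      job_posting: str, questions=()) -> str:
    """Static content first, then the per-job content that changes each call."""
    dynamic = f"## JOB POSTING\n{job_posting}\n"
    if questions:
        dynamic += "\n## QUESTIONS\n" + "\n".join(f"- {q}" for q in questions) + "\n"
    return static_prefix(master_resume, template) + dynamic
```

Two prompts built for different jobs share the full static prefix, so a provider with prefix caching can reuse it on the second call.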

Web Scraping Strategy

Job posting scraping:

Job URL
    │
    ├─ Known ATS? → Structured API (Greenhouse, Lever, Ashby, Workable, Personio, Screenloop)
    │
    └─ Unknown?  → HTML scraping
                     │
                     └─ JS-heavy? → Playwright (free, headless Chromium)
                                       │
                                       └─ Still thin? → Firecrawl
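
The escalation above differs from the LLM fallback in one respect: a scraper can succeed and still return too little text. The sketch below escalates on thin results as well as on errors. The function name, the 500-character threshold for "thin", and the use of plain callables in place of real Playwright or Firecrawl calls are all assumptions.

```python
from typing import Callable, Optional, Sequence

MIN_CHARS = 500  # assumed threshold below which a page counts as "thin"


def scrape_with_escalation(
    url: str,
    scrapers: Sequence[tuple[str, Callable[[str], str]]],
    min_chars: int = MIN_CHARS,
) -> tuple[Optional[str], str]:
    """Try cheaper scrapers first; escalate when the result is thin or fails."""
    best_name, best_text = None, ""
    for name, scrape in scrapers:
        try:
            text = scrape(url) or ""
        except Exception:
            continue  # treat a failed scraper like an empty result
        if len(text) >= min_chars:
            return name, text
        if len(text) > len(best_text):
            best_name, best_text = name, text
    # Nothing met the threshold: return the longest result seen, if any.
    return best_name, best_text
```

Ordering the list as plain HTML, then Playwright, then Firecrawl keeps the paid service as the last resort.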

Company research scraping (used in step 8 for Q&A context):

Company URL provided?
    │
    ├─ Playwright first — discovers up to 5 pages via nav links
    │
    ├─ Thin result/SPA trap? → crawl4ai (handles JS routing)
    │
    └─ Still thin? → Firecrawl
                         │
                         └─ Failed? → Plain HTML → Web search

Token Optimization

We considered several approaches to reduce Claude Code token usage:

Prompts in files (implemented): All LLM prompts stored in prompts/*.md, loaded at runtime
Templates over generation: LaTeX structure comes from master template, LLM only modifies content
Structured extraction: Job scraping uses APIs when available (cheaper than LLM parsing HTML)
Context7 MCP (configured): Provides up-to-date documentation to avoid outdated API calls

Pipeline vs Claude Code split:

Writing steps (resume tailoring, ATS edits, Q&A) → quality-first chain (Gemini → DeepSeek V3.2 → OpenRouter)
ATS keyword check → Gemini/Groq/SambaNova (free)
System development → Claude Code (when modifying the codebase)

MCP Integration

.mcp.json configures Model Context Protocol servers:

{
  "mcpServers": {
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp@latest"]
    }
  }
}

Context7 provides real-time documentation for APIs (Google Gemini, Notion, etc.) to avoid errors from outdated SDK usage.
