
security: add PDF extraction poisoning PoC for #273 #274

Open
VimalMishra-Myanatomy wants to merge 1 commit into interviewstreet:main from VimalMishra-Myanatomy:fix/pdf-extraction-poisoning-273

@VimalMishra-Myanatomy

Summary

Adds reproducible proof-of-concept artifacts and assessment tooling for #273: hidden white-on-white PDF text poisons resume extraction and inflates hiring scores on gemini-2.5-flash.

This PR does not implement a fix. It enables maintainers and contributors to reproduce the issue locally and validate a future fix.

Fixes #273

Problem

A visibly weak resume (a todo app and a calculator only) can reach an effective total score of ~91 when page 2 contains hidden forged sections (Google/Meta internships, GSoC, a fake `=== GITHUB DATA ===` block) that PyMuPDF extracts but recruiters do not see.
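
To make the mechanism concrete, here is a minimal sketch of how hidden spans could be flagged before extracted text reaches the LLM. It assumes the layout of the dictionary returned by PyMuPDF's `page.get_text("dict")` (blocks, then lines, then spans, each span carrying an integer sRGB `color`) and a plain white page background. `find_hidden_spans` and the sample data are hypothetical and are not part of this PR.

```python
from typing import Any, Dict, List

WHITE = 0xFFFFFF  # assumed page background, as an sRGB integer


def find_hidden_spans(page_dict: Dict[str, Any], background: int = WHITE) -> List[str]:
    """Return the text of spans whose color equals the assumed background."""
    hidden = []
    for block in page_dict.get("blocks", []):
        # Image blocks carry no "lines" key, so .get() skips them.
        for line in block.get("lines", []):
            for span in line.get("spans", []):
                text = span.get("text", "").strip()
                if text and span.get("color") == background:
                    hidden.append(text)
    return hidden


# Hand-built stand-in for page.get_text("dict") on a poisoned page.
sample_page = {
    "blocks": [
        {"lines": [{"spans": [{"text": "Projects: todo app, calculator", "color": 0x000000}]}]},
        {"lines": [{"spans": [{"text": "=== GITHUB DATA ===", "color": 0xFFFFFF}]}]},
    ]
}

print(find_hidden_spans(sample_page))  # ['=== GITHUB DATA ===']
```

With PyMuPDF installed, the real input would be `page.get_text("dict")` for each page of `fitz.open(path)`. An exact color match only covers the white-on-white case; near-white colors, tiny fonts, or off-page text would need additional checks.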

Confirmed assessment output is included in security/poc/assessment_v2.json.

What's included

| Path | Purpose |
| --- | --- |
| `security/poc/prompt_injection_resume_v2.pdf` | Confirmed exploit PoC (v2, extraction poisoning) |
| `security/poc/prompt_injection_resume_v3.pdf` | Combined variant for further testing |
| `security/poc/prompt_injection_resume.pdf` | v1 blunt-injection control (usually resisted by Gemini) |
| `security/poc/generate_prompt_injection_resume.py` | Regenerate PoC PDFs (`--variant v1\|v2\|v3`) |
| `security/poc/run_poc_assessment.py` | Automated exploit success/fail checker |
| `security/poc/assessment_v2.json` | Confirmed `exploit_likely: true` result |
| `security/poc/README.md` | Usage instructions |
| `security/issues/` | Issue write-up drafts (reference only) |
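
As one way to use the recorded result when validating a future fix, the sketch below reads an assessment file and reports whether the exploit still appears to succeed. It assumes `exploit_likely` is a top-level boolean in the JSON written by `--json-out`; the actual schema produced by `run_poc_assessment.py` may differ, and `exploit_still_works` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def exploit_still_works(report_path: Path) -> bool:
    """Return True if the report flags the exploit as likely (assumed top-level key)."""
    report = json.loads(report_path.read_text(encoding="utf-8"))
    # A missing key is treated as "not exploited"; a stricter gate could raise instead.
    return bool(report.get("exploit_likely", False))


# Demo with a throwaway file rather than the real security/poc/assessment_v2.json.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "assessment_demo.json"
    demo.write_text(json.dumps({"exploit_likely": True}), encoding="utf-8")
    print(exploit_still_works(demo))  # True
```

A fix could then be considered validated for this PoC once the same check returns `False` on a fresh, uncached run.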

How to reproduce

pip install -r requirements.txt

LLM_PROVIDER=gemini
DEFAULT_MODEL=gemini-2.5-flash
GEMINI_API_KEY=<your-key>

python security/poc/generate_prompt_injection_resume.py --variant v2
python security/poc/run_poc_assessment.py \
  --pdf security/poc/prompt_injection_resume_v2.pdf \
  --no-cache \
  --json-out security/poc/assessment_v2.json
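
To repeat the run for every variant, the commands above could be assembled programmatically. This is a sketch only: the PDF names follow the paths listed under "What's included", while the per-variant output names (`assessment_v1.json`, `assessment_v3.json`) and the helper `build_commands` are assumptions, as is the idea that the generator writes each PDF to the listed path.

```python
import sys
from typing import Dict, List

POC_DIR = "security/poc"

# PDF names as listed in this PR; v1 has no version suffix.
PDF_BY_VARIANT: Dict[str, str] = {
    "v1": f"{POC_DIR}/prompt_injection_resume.pdf",
    "v2": f"{POC_DIR}/prompt_injection_resume_v2.pdf",
    "v3": f"{POC_DIR}/prompt_injection_resume_v3.pdf",
}


def build_commands(variant: str) -> List[List[str]]:
    """Return the generate and assess argv lists for one variant."""
    pdf = PDF_BY_VARIANT[variant]  # raises KeyError on an unknown variant
    generate = [
        sys.executable, f"{POC_DIR}/generate_prompt_injection_resume.py",
        "--variant", variant,
    ]
    assess = [
        sys.executable, f"{POC_DIR}/run_poc_assessment.py",
        "--pdf", pdf,
        "--no-cache",
        # Output name is an assumption for v1 and v3.
        "--json-out", f"{POC_DIR}/assessment_{variant}.json",
    ]
    return [generate, assess]


for cmd in build_commands("v2"):
    print(" ".join(cmd[1:]))
```

Each argv list could be passed to `subprocess.run(cmd, check=True)` once `LLM_PROVIDER`, `DEFAULT_MODEL`, and `GEMINI_API_KEY` are set in the environment.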
  
  

If preferred, I am happy to keep the PoC files under `security/poc/` behind docs-only guidance, or to drop them from this PR and keep the artifacts issue-only.


Development

Successfully merging this pull request may close these issues.

Security | Hidden PDF text poisons resume extraction and inflates hiring scores
