HuangTM23/SurveyForge


SurveyForge: An LLM-Orchestrated Literature Survey Agent

A multi-LLM collaborative system for literature retrieval, screening, reading, and survey generation.

中文说明 / Chinese README

SurveyForge is a file-based AI literature survey agent. It starts from a natural-language research topic, plans broad scholarly search queries, retrieves candidate papers, enriches metadata, performs multi-LLM screening and classification, manages PDF download tasks, and produces structured reading cards, category summaries, a global survey summary, and CSV tables for review writing.

Project Goal / 项目目标

SurveyForge is designed for researchers writing technical literature reviews. The system separates deterministic engineering tasks from LLM decisions:

  • Python scripts handle file IO, retrieval, OpenAlex/Crossref enrichment, PDF matching, CSV/JSON/Markdown export, and resumable orchestration.
  • LLMs handle planning, candidate filtering, title classification, category-level selection, single-paper reading, category summaries, and global review writing.
  • Every intermediate artifact is saved locally so each step can be inspected, debugged, and rerun without starting from scratch.

SurveyForge targets technical survey writing, breaking literature research into an inspectable, checkpoint-resumable pipeline: every key judgment is made by an LLM, while local scripts handle only the stable, repeatable engineering work.
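The resumable design described above can be sketched as a skip-if-exists wrapper around each step. This is an illustrative pattern, not the project's actual code; `run_step` and the single-JSON-artifact-per-step assumption are hypothetical.

```python
import json
from pathlib import Path


def run_step(output_path, step_fn):
    """Run a pipeline step only if its artifact is missing.

    Each step writes one JSON artifact, so rerunning the pipeline
    resumes from the first missing file instead of starting over.
    """
    out = Path(output_path)
    if out.exists():
        return json.loads(out.read_text(encoding="utf-8"))
    result = step_fn()
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Because every intermediate artifact lands on disk, deleting one file is enough to force that step (and only that step) to rerun.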

Release Version / 发行版本

Current release: SurveyForge v1.0.

Repository Layout / 仓库结构

.
├── README.md
├── requirements.txt
├── LICENSE
└── surveyforge/
    ├── cli.py
    ├── .env.example
    ├── configs/
    │   ├── defaults.yaml
    │   └── runtime.yaml
    ├── prompts/
    │   ├── planner.md
    │   ├── candidate_filter.md
    │   ├── title_classify.md
    │   ├── intent_select.md
    │   ├── paper_reading_card.md
    │   ├── category_summary.md
    │   ├── review_polish.md
    │   └── provider_healthcheck.md
    └── src/
        ├── core/
        ├── stage1/
        ├── stage2/
        └── tools/

Generated outputs are intentionally ignored by Git:

surveyforge/temp/      intermediate artifacts
surveyforge/final/     final reports and tables
surveyforge/papers/    downloaded or manually collected PDFs
surveyforge/.env       local API keys
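If you mirror this layout, the corresponding ignore rules are a short fragment like the following (a sketch; the repository's actual .gitignore may differ):

```
surveyforge/temp/
surveyforge/final/
surveyforge/papers/
surveyforge/.env
```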

Installation / 安装

Python 3.8+ is recommended.

pip install -r requirements.txt
python3 -m playwright install

Stage 2 uses pdftotext for PDF text extraction. On Ubuntu:

sudo apt-get install poppler-utils
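Invoking pdftotext from Python is a thin subprocess wrapper; the sketch below is illustrative (the project's actual extraction code may differ). `-layout` preserves column layout, which helps with tables and captions, and `-` sends output to stdout.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, first_page=None, last_page=None):
    """Build the pdftotext command line: -layout keeps column
    layout, -f/-l bound the page range, '-' means stdout."""
    cmd = ["pdftotext", "-layout"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]
    if last_page is not None:
        cmd += ["-l", str(last_page)]
    return cmd + [pdf_path, "-"]


def extract_pdf_text(pdf_path, first_page=None, last_page=None):
    """Run pdftotext and return the extracted text.

    Requires poppler-utils; fails fast with a clear error if the
    binary is missing.
    """
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils")
    cmd = build_pdftotext_cmd(pdf_path, first_page, last_page)
    return subprocess.run(cmd, check=True, capture_output=True,
                          text=True).stdout
```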

LLM Providers / 大模型 Provider

Supported providers:

  • chatgpt: smart provider. Uses OpenAI API if OPENAI_API_KEY exists in surveyforge/.env; otherwise falls back to logged-in Codex CLI.
  • gemini: smart provider. Uses Gemini API if GEMINI_API_KEY exists in surveyforge/.env; otherwise falls back to logged-in Gemini CLI.
  • kimi: Moonshot API.
  • deepseek: DeepSeek API.
  • xiaomi_mimo: Xiaomi MiMo API.

Default models:

Provider      Default backend / model
chatgpt       gpt-5.4 if API key exists, otherwise Codex CLI
gemini        gemini-3-flash-preview if API key exists, otherwise Gemini CLI
kimi          kimi-k2.5
deepseek      deepseek-v4-pro (reasoning_effort=high, thinking enabled)
xiaomi_mimo   mimo-v2.5-pro
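The smart-provider fallback for chatgpt and gemini can be sketched as a simple environment-key check. Function and constant names here are illustrative; the project's actual resolution logic may differ.

```python
import os

# Maps each smart provider to (env var, API model, CLI fallback).
# Model names mirror the table above; treat them as config values.
SMART_PROVIDERS = {
    "chatgpt": ("OPENAI_API_KEY", "gpt-5.4", "codex-cli"),
    "gemini": ("GEMINI_API_KEY", "gemini-3-flash-preview", "gemini-cli"),
}


def resolve_backend(provider, env=os.environ):
    """Return (backend_kind, name) for a smart provider.

    If the provider's API key is present in the environment
    (loaded from surveyforge/.env), use the API model; otherwise
    fall back to the logged-in CLI.
    """
    key_var, api_model, cli = SMART_PROVIDERS[provider]
    if env.get(key_var):
        return ("api", api_model)
    return ("cli", cli)
```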

Create local credentials:

cp surveyforge/.env.example surveyforge/.env

Fill only the keys you use:

MOONSHOT_API_KEY=...
DEEPSEEK_API_KEY=...
MIMO_API_KEY=...
GEMINI_API_KEY=...
OPENAI_API_KEY=...

Provider health check:

python3 surveyforge/cli.py providers
python3 surveyforge/cli.py providers --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo

Stage 1: Paper Selection / 第一阶段:文献挑选

Stage 1 produces selected papers for deep reading.

Starting from a natural-language research topic, Stage 1 performs search planning, candidate retrieval, metadata enrichment, prefiltering, LLM-based classification and screening, and PDF download-queue preparation.

Pipeline:

Natural-language topic
  -> LLM planning
  -> Google Scholar / mirror / browser-assisted retrieval
  -> OpenAlex first, Crossref fallback metadata enrichment
  -> local prefilter by year/title/venue
  -> LLM candidate filtering
  -> title-only classification
  -> multi-provider category-level selection
  -> selected_papers
  -> PDF download or manual download queue
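The local prefilter step in this pipeline can be sketched as a pure function over enriched records. Field names and thresholds below are assumptions for illustration; the actual rules live in the project's configs.

```python
def prefilter(records, min_year=2015, venue_blocklist=frozenset()):
    """Apply cheap deterministic checks before any LLM call:
    a publication-year floor, a non-empty title, title-level
    deduplication, and a venue blocklist."""
    kept = []
    seen_titles = set()
    for rec in records:
        title = (rec.get("title") or "").strip().lower()
        if not title or title in seen_titles:
            continue  # drop empty and duplicate titles
        if (rec.get("year") or 0) < min_year:
            continue  # drop papers older than the floor
        if rec.get("venue") in venue_blocklist:
            continue  # drop blocklisted venues
        seen_titles.add(title)
        kept.append(rec)
    return kept
```

Everything that survives this filter still goes through LLM candidate filtering, so the thresholds only need to be conservative, not perfect.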

Run all Stage 1 steps:

python3 surveyforge/cli.py stage1 --provider chatgpt --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo --retrieve-mode browser --batch-size 20

Run step by step:

python3 surveyforge/cli.py plan --provider chatgpt
python3 surveyforge/cli.py retrieve --mode browser
python3 surveyforge/cli.py enrich
python3 surveyforge/cli.py prefilter
python3 surveyforge/cli.py screen --provider chatgpt --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo --batch-size 20
python3 surveyforge/cli.py download

Important outputs:

surveyforge/temp/stage1/step1/planning.json
surveyforge/temp/stage1/step2/candidate_pool_raw.jsonl
surveyforge/temp/stage1/step2/candidate_pool_enriched.jsonl
surveyforge/temp/stage1/step2/candidate_pool_pre_filtered.jsonl
surveyforge/temp/stage1/step3/title_clusters.json
surveyforge/temp/stage1/step3/selected_papers.jsonl
surveyforge/temp/stage1/step4/manual_download_queue.md
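The JSONL artifacts above can be inspected with a few lines of Python (a generic loader sketch; record field names depend on the pipeline step):

```python
import json


def load_jsonl(path):
    """Load one JSON object per line, skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `load_jsonl("surveyforge/temp/stage1/step3/selected_papers.jsonl")` returns the selected papers as a list of dicts for ad-hoc checks between runs.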

Manual PDFs should be placed under:

surveyforge/papers/manual/

Stage 2: Deep Reading and Survey Report / 第二阶段:深度阅读与综述报告

Stage 2 reads selected papers by category and produces structured outputs for review writing.

From the selected papers and local PDFs, Stage 2 generates per-paper reading cards, category summaries, a global summary, and CSV tables.

Pipeline:

selected_papers
  -> group by category
  -> category-parallel single-paper reading
  -> PDF matching and text extraction
  -> optional enhanced metric reading from relevant figure/table captions
  -> paper_review_cards
  -> compact category cards
  -> category summaries by main provider
  -> global summary by main provider
  -> CSV / JSON / Markdown outputs
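The category-parallel reading step can be sketched with a thread pool bounded by --max-workers. This is an illustrative shape, not the project's scheduler; `read_paper` stands in for the per-paper LLM reading call.

```python
from concurrent.futures import ThreadPoolExecutor


def read_categories(categories, read_paper, max_workers=3):
    """Read categories in parallel, papers within a category
    sequentially, so per-category artifacts are written in order.

    categories maps category_id -> list of papers.
    """
    def read_category(item):
        cat_id, papers = item
        return cat_id, [read_paper(p) for p in papers]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(read_category, categories.items()))
```

Bounding parallelism at the category level keeps concurrent LLM calls predictable while still overlapping slow PDF reads across categories.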

Run Stage 2:

python3 surveyforge/cli.py stage2 --provider chatgpt --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo --max-workers 3

Small test, one paper per category:

python3 surveyforge/cli.py stage2 --provider deepseek --providers deepseek,kimi,xiaomi_mimo --max-workers 3 --papers-per-category 1

Enhanced metric extraction:

python3 surveyforge/cli.py stage2 --provider chatgpt --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo --max-workers 3 --enhanced-metrics

--enhanced-metrics restricts reading to relevant figure/table captions and the text around them, focusing on windows that mention terms such as experiment, result, accuracy, error, trajectory, environment, sensor, RMSE, mean error, P90, and CDF. It does not perform image OCR.
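The caption selection behind --enhanced-metrics can be sketched as a keyword filter. The term list below mirrors the flag's description; the real selection logic (including the nearby-text windows) may differ.

```python
import re

# Illustrative metric vocabulary taken from the flag's description.
METRIC_TERMS = re.compile(
    r"\b(experiment|result|accuracy|error|trajectory|environment|"
    r"sensor|RMSE|mean error|P90|CDF)\b",
    re.IGNORECASE,
)


def select_metric_captions(captions):
    """Keep only figure/table captions that mention a metric term."""
    return [c for c in captions if METRIC_TERMS.search(c)]
```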

Rerun only category and global summaries without reading PDFs again:

python3 surveyforge/cli.py summarize --provider deepseek --providers deepseek,kimi,xiaomi_mimo

Important outputs:

surveyforge/temp/stage2/<category_id>/paper_review_cards.jsonl
surveyforge/temp/stage2/<category_id>/category_reading_cards_compact.json
surveyforge/temp/stage2/<category_id>/category_summary.json

surveyforge/final/stage2/paper_review_cards.jsonl
surveyforge/final/stage2/category_summaries.json
surveyforge/final/stage2/global_summary.json
surveyforge/final/stage2/survey_table.csv
surveyforge/final/stage2/survey_review.json

Full Pipeline / 全流程运行

python3 surveyforge/cli.py all --provider chatgpt --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo --retrieve-mode browser --batch-size 20 --max-workers 3

With enhanced metric extraction:

python3 surveyforge/cli.py all --provider chatgpt --providers chatgpt,gemini,kimi,deepseek,xiaomi_mimo --retrieve-mode browser --batch-size 20 --max-workers 3 --enhanced-metrics

stage1, stage2, summarize, and all run provider preflight checks. If any requested provider is unavailable, the command stops and prints available providers.
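The stop-and-report preflight behaviour can be sketched as follows. `probe` stands in for whatever cheap per-provider health request the project actually issues; names are hypothetical.

```python
def preflight(requested, probe):
    """Check each requested provider with a cheap probe call.

    Returns (ok, available). ok is False if any requested provider
    fails, in which case the available ones are reported so the
    user can retry with a reduced --providers list.
    """
    available = [p for p in requested if probe(p)]
    missing = [p for p in requested if p not in available]
    if missing:
        print("Unavailable providers:", ", ".join(missing))
        print("Available providers:", ", ".join(available))
        return False, available
    return True, available
```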

Cleaning Outputs / 清理运行产物

Clean generated temp and final outputs:

python3 surveyforge/cli.py clean

Also remove PDFs:

python3 surveyforge/cli.py clean --include-papers

Notes and Limitations / 注意事项

  • Google Scholar may block automated requests. Browser-assisted mode is recommended when verification or mirror navigation is needed.
  • OpenAlex is used first for metadata enrichment, with Crossref as a fallback. Papers found in neither source are discarded.
  • Publisher PDFs should be downloaded only through legal access paths. For subscription-only papers, use the manual queue and place files under surveyforge/papers/manual/.
  • Stage 2 PDF reading depends on extractable text. Scanned figures without embedded text may require manual checking.
  • LLM outputs should be reviewed before being used in a formal publication.

License / 许可证

MIT License.
