An open-source web app that visualizes the references of an uploaded scientific paper as an interactive graph.
- Upload a PDF file of a paper
- Extract title and references using Grobid
- Display an interactive graph with the paper in the center and references around it
- Show in-text citation counts on each reference node
- Select the 5 references cited most often in the paper's text
- Enrich references via Crossref (adds each reference's own references as children when a DOI is available)
- Frontend: SvelteKit 2, TailwindCSS, force-graph/d3
- Backend: FastAPI (Python)
- PDF Processing: Grobid
- Data Enrichment: Crossref API
- Clone the repository:
  git clone https://github.com/YasharSL/Ref-Graph.git
  cd Ref-Graph
- Prerequisites:
  - Docker & Docker Compose v2
  - Node.js 18+ and npm
  - Python 3.10+ (only if running the backend outside Docker)
- Set up the backend (if not using Docker):
  cd backend
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  pip install -r requirements.txt
- Set up the frontend:
  cd frontend
  npm install
- Start services (Grobid + Backend) with Docker Compose:
  docker-compose up -d --build
  - Exposes Grobid on http://localhost:8070 and the API on http://localhost:8000
- Run the frontend (in a separate terminal):
  cd frontend
  # Optional: configure API base URL (defaults to http://localhost:8000)
  # echo VITE_API_BASE=http://localhost:8000 > .env.local
  npm run dev
- Open http://localhost:5173 in your browser and upload a PDF.
If you prefer to run the backend outside Docker, ensure Grobid is reachable (see GROBID_URL below), then:
cd backend
uvicorn app.main:app --reload
- Backend environment variables:
  - GROBID_URL (default: http://localhost:8070/api)
  - CROSSREF_MAILTO (optional; adds a mailto to the Crossref User-Agent for rate limits)
- Frontend environment variables:
  - VITE_API_BASE (default: http://localhost:8000)
To set frontend env locally, create frontend/.env.local:
VITE_API_BASE=http://localhost:8000
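For the backend variables listed above, a minimal sketch of how they might be read in Python, with the documented default for GROBID_URL. The `Settings` class, the `load_settings` and `crossref_user_agent` helpers, and the "Ref-Graph" User-Agent prefix are illustrative assumptions, not the project's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the two backend variables."""
    grobid_url: str
    crossref_mailto: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Fall back to the documented default when GROBID_URL is unset.
    grobid_url = env.get("GROBID_URL", "http://localhost:8070/api").rstrip("/")
    # Treat an empty CROSSREF_MAILTO the same as an unset one.
    mailto = env.get("CROSSREF_MAILTO") or None
    return Settings(grobid_url=grobid_url, crossref_mailto=mailto)


def crossref_user_agent(settings: Settings, app: str = "Ref-Graph") -> str:
    # The app name is an assumption; only the mailto part comes from the env.
    if settings.crossref_mailto:
        return f"{app} (mailto:{settings.crossref_mailto})"
    return app
```

Passing a plain dict instead of `os.environ` makes the helper easy to try out, for example `load_settings({"CROSSREF_MAILTO": "me@example.org"})`.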
- Endpoint: POST /upload
- Content-Type: multipart/form-data
- Form field: file (the PDF)
Example (PowerShell 7+; the -Form parameter is not available in Windows PowerShell 5.1):
Invoke-RestMethod -Method Post -Uri http://localhost:8000/upload -Form @{
  file = Get-Item .\sample.pdf
}
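For callers outside PowerShell, the same upload can be sketched in Python using only the standard library. The helper names `build_multipart` and `upload_pdf` are hypothetical, and the 120-second timeout is an arbitrary choice. The endpoint, form field name, and default base URL are taken from the description above.

```python
import json
import uuid
from pathlib import Path
from urllib import request


def build_multipart(field: str, filename: str, data: bytes, boundary: str) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_pdf(pdf_path: str, api_base: str = "http://localhost:8000") -> dict:
    """POST the PDF to /upload in the form field "file" and return the parsed JSON."""
    path = Path(pdf_path)
    body, content_type = build_multipart(
        "file", path.name, path.read_bytes(), uuid.uuid4().hex
    )
    req = request.Request(
        f"{api_base}/upload",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Large PDFs can take a while to process, so allow a generous timeout.
    with request.urlopen(req, timeout=120) as resp:
        return json.load(resp)
```

With the services running, `upload_pdf("sample.pdf")["title"]` should give the extracted paper title.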
Example response (truncated):
{
  "title": "Paper Title",
  "references": [
    {
      "id": "b0",
      "title": "Reference Title",
      "doi": "10.1234/abcd.1",
      "count": 3,
      "title_corrected": true,
      "children": [ { "title": "Child Ref", "doi": "10.5678/xyz" } ]
    }
  ],
  "debug": {
    "total_references_found": 42
  }
}
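To illustrate how a response of this shape maps onto a graph, here is a small sketch that converts it into the `nodes`/`links` structure that force-graph style libraries consume. The field names come from the example response; the `to_graph` helper, the `"paper"` node id, the `kind` labels, and the child id scheme are assumptions made for illustration, not the frontend's actual logic.

```python
def to_graph(payload: dict) -> dict:
    """Build {"nodes": [...], "links": [...]} with the paper at the center."""
    nodes = [{"id": "paper", "label": payload.get("title", ""), "kind": "paper"}]
    links = []
    for ref in payload.get("references", []):
        ref_id = ref["id"]
        nodes.append({
            "id": ref_id,
            "label": ref.get("title", ""),
            "kind": "reference",
            "count": ref.get("count", 0),  # in-text citation count
        })
        links.append({"source": "paper", "target": ref_id})
        # Children are present only when Crossref enrichment succeeded.
        for i, child in enumerate(ref.get("children", [])):
            child_id = f"{ref_id}/child{i}"  # hypothetical id scheme
            nodes.append({"id": child_id, "label": child.get("title", ""), "kind": "child"})
            links.append({"source": ref_id, "target": child_id})
    return {"nodes": nodes, "links": links}
```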
Notes:
- The backend selects the top 5 references by in-text citation count.
- Crossref enrichment requires network access; providing CROSSREF_MAILTO is recommended.
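The top-5 selection mentioned in the notes can be sketched as follows. This is one plausible way to do it, assuming each reference is a dict with a `count` field as in the example response; the `top_references` helper is hypothetical and the backend's actual implementation may differ.

```python
import heapq


def top_references(references: list[dict], k: int = 5) -> list[dict]:
    """Return the k references with the highest in-text citation count."""
    # References without a count are treated as cited zero times.
    return heapq.nlargest(k, references, key=lambda ref: ref.get("count", 0))
```

For example, given seven references with distinct counts, the call returns the five with the largest counts, ordered from most to least cited.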
From frontend/:
npm run dev # start dev server
npm run build # build to dist/
npm run preview # preview built site
npm run check # Svelte/TypeScript checks
- Grobid not reachable: confirm http://localhost:8070 is up (docker ps, docker logs <grobid-container>).
- 500/timeout on upload: usually caused by large PDFs or network issues; try a smaller file or check the Grobid logs.
- Crossref data missing: ensure internet access and consider setting CROSSREF_MAILTO.
- CORS issues in browser: the backend enables * by default; verify that frontend/src/lib/config.ts points to the correct API.
Pull requests are welcome. For major changes, please open an issue first.
MIT