kafkayu/AutoFigure-Edit

AutoFigure-edit Logo

AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations [ICLR 2026]

English | 中文

ICLR 2026 License: MIT Python HuggingFace

From Method Text to Editable SVG
AutoFigure-edit is the next version of AutoFigure. It turns paper method sections into fully editable SVG figures and lets you refine them in an embedded SVG editor.

Quick Start • Web Interface • How It Works • Configuration • Citation

[Paper] [Project] [BibTeX]


✨ Features

Feature Description
📝 Text-to-Figure Generate a draft figure directly from method text.
🧠 SAM3 Icon Detection Detect icon regions from multiple prompts and merge overlaps.
🎯 Labeled Placeholders Insert consistent AF-style placeholders for reliable SVG mapping.
🧩 SVG Generation Produce an editable SVG template aligned to the figure.
🖥️ Embedded Editor Edit the SVG in-browser using the bundled svg-edit.
📦 Artifact Outputs Save PNG/SVG outputs and icon crops per run.
📊 Chart-to-Code Convert charts to Python code using SAM3 segmentation (optional) with code evaluation.
📑 SVG-to-PPT Export generated SVG figures directly to PowerPoint presentations.

🎨 Gallery: Editable Vectorization & Style Transfer

AutoFigure-edit introduces two breakthrough capabilities:

  1. Fully Editable SVGs (Pure Code Implementation): Unlike raster images, our outputs are structured vector graphics (SVG). Every component is editable: text, shapes, and layout can be modified losslessly.
  2. Style Transfer: The system can mimic the artistic style of reference images provided by the user.

Below are 9 examples covering 3 different papers, each rendered in 3 different reference styles. (Each image shows: Left = AutoFigure Generation | Right = Vectorized Editable SVG)

Paper & Style Transfer Demonstration
CycleResearcher / Style 1
Paper 1 Style 1
CycleResearcher / Style 2
Paper 1 Style 2
CycleResearcher / Style 3
Paper 1 Style 3
DeepReviewer / Style 1
Paper 2 Style 1
DeepReviewer / Style 2
Paper 2 Style 2
DeepReviewer / Style 3
Paper 2 Style 3
DeepScientist / Style 1
Paper 3 Style 1
DeepScientist / Style 2
Paper 3 Style 2
DeepScientist / Style 3
Paper 3 Style 3

🚀 How It Works

The AutoFigure-edit pipeline transforms a raw generation into an editable SVG in four distinct stages:

Pipeline Visualization: Figure -> SAM -> Template -> Final
(1) Raw Generation → (2) SAM3 Segmentation → (3) SVG Layout Template → (4) Final Assembled Vector

  1. Generation (figure.png): The LLM generates a raster draft based on the method text.
  2. Segmentation (sam.png): SAM3 detects and segments distinct icons and text regions.
  3. Templating (template.svg): The system constructs a structural SVG wireframe using placeholders.
  4. Assembly (final.svg): High-quality cropped icons and vectorized text are injected into the template.
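The four stages above can be sketched as a simple orchestration. This is a minimal illustration with stand-in callables; the real modules live in autofigure/pipeline/step*.py and take many more parameters:

```python
# Sketch of the four-stage flow; the stage functions are passed in as
# stand-ins for the real pipeline modules (names here are hypothetical).

def run_pipeline(method_text, generate, segment, build_template, assemble):
    figure = generate(method_text)             # (1) figure.png
    boxes = segment(figure)                    # (2) sam.png + boxlib.json
    template = build_template(figure, boxes)   # (3) template.svg
    return assemble(template, boxes)           # (4) final.svg
```

Each stage consumes only the artifacts of earlier stages, which is why intermediate outputs (figure.png, boxlib.json, template.svg) can be inspected or swapped independently.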
View Detailed Technical Pipeline
AutoFigure-edit Technical Pipeline

AutoFigure-edit's pipeline starts from the paper's method text and first calls a text-to-image LLM to render a journal-style schematic, saved as figure.png. The system then runs SAM3 segmentation on that image using one or more text prompts (e.g., "icon, diagram, arrow"), merges overlapping detections by an IoU-like threshold, and draws gray-filled, black-outlined labeled boxes on the original; this produces both samed.png (the labeled mask overlay) and a structured boxlib.json with coordinates, scores, and prompt sources.
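The overlap-merging step can be sketched as follows. This is a minimal illustration assuming (x1, y1, x2, y2) pixel boxes and greedy union-merging; the repo's actual merge logic may differ, and merge_boxes is a hypothetical name, not the project's API:

```python
# Merge overlapping SAM detections whose IoU exceeds a threshold.
# Box format: (x1, y1, x2, y2) in pixel coordinates.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def merge_boxes(boxes, threshold=0.5):
    """Greedily fold each box into the first kept box it overlaps."""
    merged = []
    for box in boxes:
        for i, kept in enumerate(merged):
            if iou(box, kept) > threshold:
                # Replace the kept box with the union of the two.
                merged[i] = (min(box[0], kept[0]), min(box[1], kept[1]),
                             max(box[2], kept[2]), max(box[3], kept[3]))
                break
        else:
            merged.append(box)
    return merged
```

Setting the threshold to 0 (as --merge_threshold 0 does in the CLI) would keep every detection as a separate box.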

Next, each box is cropped from the original figure and passed through RMBG-2.0 for background removal, yielding transparent icon assets under icons/*.png and *_nobg.png. With figure.png, samed.png, and boxlib.json as multimodal inputs, the LLM generates a placeholder-style SVG (template.svg) whose boxes match the labeled regions.

Optionally, the SVG is iteratively refined by an LLM optimizer to better align strokes, layouts, and styles, resulting in optimized_template.svg (or the original template if optimization is skipped). The system then compares the SVG dimensions with the original figure to compute scale factors and aligns coordinate systems. Finally, it replaces each placeholder in the SVG with the corresponding transparent icon (matched by label/ID), producing the assembled final.svg.
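The scale alignment and icon injection described above can be sketched as follows. This is an illustrative sketch, not the repo's implementation: the function names are hypothetical, and it assumes placeholders are self-closing rects identified by an id attribute:

```python
# Sketch: scale a pixel-space box from figure.png into the SVG coordinate
# system, then swap the matching placeholder rect for an <image> element
# that embeds the transparent icon crop.
import re

def to_svg_coords(box, fig_size, svg_size):
    """Scale an (x, y, w, h) pixel box by the per-axis scale factors."""
    sx = svg_size[0] / fig_size[0]
    sy = svg_size[1] / fig_size[1]
    x, y, w, h = box
    return (x * sx, y * sy, w * sx, h * sy)

def replace_placeholder(svg_text, label, icon_href, box):
    """Replace the placeholder rect whose id equals `label` with an <image>."""
    x, y, w, h = box
    image = (f'<image href="{icon_href}" x="{x:.1f}" y="{y:.1f}" '
             f'width="{w:.1f}" height="{h:.1f}"/>')
    pattern = rf'<rect[^>]*id="{re.escape(label)}"[^>]*/>'
    return re.sub(pattern, image, svg_text)
```

Matching by label/ID rather than by position keeps the assembly robust even if the optimizer reorders elements inside the SVG.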

Key configuration details:

  • Placeholder Mode: Controls how icon boxes are encoded in the prompt (label, box, or none).
  • Optimization: optimize_iterations=0 allows skipping the refinement step to use the raw structure directly.

⚡ Quick Start

Option 1: Conda Environment

# 1) Create and activate conda environment
conda create -n autofigure python=3.10
conda activate autofigure

# 2) Install dependencies
pip install -r requirements.txt

# 3) Install SAM3 separately (not vendored in this repo)
pip install -e sam3

Option 2: Docker Deployment (Recommended)

# Build Docker image
docker build -f docker/Dockerfile -t autofigure:latest .

# Run container with GPU support
docker run --name autofigure \
  --gpus all \
  --shm-size 32g \
  -p 8000:8000 \
  --ipc=host \
  -v /path/to/models:/root/models \
  -v /path/to/code:/app/ \
  -it autofigure:latest /bin/bash

Option 3: Web Interface

python server.py

Then open http://localhost:8000.


Run:

# Basic usage with text-to-image generation
python autofigure_main.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider bianxie \
  --api_key YOUR_KEY

# Using local image (skip text-to-image generation)
python autofigure_main.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider local \
  --local_img_path path/to/your/image.png \
  --sam_checkpoint_path /path/to/sam3.pt

# Convert chart to Python code (with SAM3 segmentation)
python autofigure_main.py \
  --method_file paper.txt \
  --output_dir outputs/chart_demo \
  --provider local \
  --local_img_path path/to/chart.png \
  --task_type chart_code \
  --chart_use_sam \
  --sam_checkpoint_path /path/to/sam3.pt \
  --sam_prompt "axis,line,curve,bar,marker,legend,grid" \
  --enable_evaluation \
  --reference_code_path path/to/reference.py

# Generate SVG and convert to PowerPoint
python autofigure_main.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider local \
  --local_img_path path/to/image.png \
  --sam_checkpoint_path /path/to/sam3.pt \
  --convert_to_ppt \
  --ppt_output_path outputs/demo/result.pptx

🖥️ Web Interface Demo

AutoFigure-edit provides a visual web interface designed for seamless generation and editing.

1. Configuration Page

Configuration Page

On the start page, paste your paper's method text on the left. On the right, configure your generation settings:

  • Provider: Select your LLM provider (OpenRouter or Bianxie).
  • Optimize: Set SVG template refinement iterations (recommend 0 for standard use).
  • Reference Image: Upload a target image to enable style transfer.
  • SAM3 Backend: Choose local SAM3 or the fal.ai API (API key optional).

2. Canvas & Editor

Canvas Page

The generation result loads directly into an integrated SVG-Edit canvas, allowing for full vector editing.

  • Status & Logs: Check real-time progress (top-left) and view detailed execution logs (top-right button).
  • Artifacts Drawer: Click the floating button (bottom-right) to expand the Artifacts Panel. This contains all intermediate outputs (icons, SVG templates, etc.). You can drag and drop any artifact directly onto the canvas for custom composition.

🧩 SAM3 Installation Notes

AutoFigure-edit depends on SAM3 but does not vendor it. Please follow the official SAM3 installation guide and prerequisites. The upstream repo currently targets Python 3.12+, PyTorch 2.7+, and CUDA 12.6 for GPU builds.

SAM3 checkpoints are hosted on Hugging Face and may require you to request access and authenticate (e.g., huggingface-cli login) before download.

SAM3 API Mode (No Local Install)

If you prefer not to install SAM3 locally, you can use an API backend (also supported in the Web demo). We recommend using Roboflow as it is free to use.

Option A: fal.ai

export FAL_KEY="your-fal-key"
python autofigure_main.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider bianxie \
  --api_key YOUR_KEY \
  --sam_backend fal

Option B: Roboflow

export ROBOFLOW_API_KEY="your-roboflow-key"
python autofigure_main.py \
  --method_file paper.txt \
  --output_dir outputs/demo \
  --provider bianxie \
  --api_key YOUR_KEY \
  --sam_backend roboflow

Optional CLI flags (API):

  • --sam_api_key (overrides FAL_KEY/ROBOFLOW_API_KEY)
  • --sam_max_masks (default: 32, fal.ai only)

⚙️ Configuration

Supported LLM Providers

Provider Base URL Notes
OpenRouter openrouter.ai/api/v1 Supports Gemini/Claude/others
Bianxie api.bianxie.ai/v1 OpenAI-compatible API
Local N/A Use local images without text-to-image generation

Common CLI flags:

  • --provider (openrouter | bianxie | local)
  • --image_model, --svg_model
  • --local_img_path (path to local image when using local provider)
  • --task_type (icon_svg | chart_code, default: icon_svg)
  • --chart_use_sam (use SAM3 for chart code generation)
  • --enable_evaluation (enable code evaluation for chart_code mode)
  • --sam_prompt (comma-separated prompts)
  • --sam_backend (local | fal | roboflow | api)
  • --sam_checkpoint_path (path to SAM3 checkpoint)
  • --sam_api_key (API key override; falls back to FAL_KEY or ROBOFLOW_API_KEY)
  • --sam_max_masks (fal.ai max masks, default 32)
  • --merge_threshold (0 disables merging)
  • --optimize_iterations (0 disables optimization)
  • --reference_image_path (optional, for style transfer)
  • --convert_to_ppt (convert SVG to PowerPoint)
  • --ppt_output_path (PPT output path)
  • --reference_code_path (reference code path)

📁 Project Structure

Click to expand directory tree
AutoFigure-edit/
├── autofigure_main.py         # Main entry point
├── server.py                  # FastAPI backend for web interface
├── requirements.txt           # Python dependencies
├── autofigure/                # Core package
│   ├── config.py              # Configuration and provider settings
│   ├── pipeline/              # Pipeline modules
│   │   ├── step1_generate.py  # Text-to-image generation
│   │   ├── step2_sam.py       # SAM3 segmentation
│   │   ├── step3_rmbg.py      # Background removal
│   │   ├── step4_svg_template.py  # SVG template generation
│   │   ├── step4_chart_code.py    # Chart-to-code conversion
│   │   ├── step5_replace_icons.py  # Final SVG assembly
│   │   ├── step6_optimize.py    # Multi-round feedback optimization
│   │   └── step7_evaluate.py  # Evaluate chart2code generation quality
│   ├── providers/             # LLM provider implementations
│   │   ├── openrouter.py
│   │   ├── bianxie.py
│   │   └── local.py           # Local image mode
│   ├── processors/            # Image processing utilities
│   ├── converters/            # Format converters (SVG to PPT)
│   └── utils/                 # Helper functions
├── docker/                    # Docker deployment files
│   ├── Dockerfile
│   └── README.md
├── examples/                  # Example scripts and inputs
│   ├── testfigure.sh
│   └── testchart_local.sh
├── web/                       # Web interface frontend
│   ├── index.html
│   ├── canvas.html
│   ├── styles.css
│   ├── app.js
│   └── vendor/svg-edit/       # Embedded SVG editor
└── img/                       # README assets

🤝 Community & Support

WeChat Discussion Group
Scan the QR code to join our community. If the code is expired, please add WeChat ID nauhcutnil or contact tuchuan@mail.hfut.edu.cn.

WeChat 1 WeChat 2

📜 Citation & License

If you find AutoFigure or FigureBench helpful, please cite:

@inproceedings{zhu2026autofigure,
  title = {AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations},
  author = {Minjun Zhu and Zhen Lin and Yixuan Weng and Panzhong Lu and Qiujie Xie and Yifan Wei and Sifan Liu and QiYao Sun and Yue Zhang},
  booktitle = {The Fourteenth International Conference on Learning Representations},
  year = {2026},
  url = {https://openreview.net/forum?id=5N3z9JQJKq}
}

@dataset{figurebench2025,
  title = {FigureBench: A Benchmark for Automated Scientific Illustration Generation},
  author = {WestlakeNLP},
  year = {2025},
  url = {https://huggingface.co/datasets/WestlakeNLP/FigureBench}
}

This project is licensed under the MIT License - see LICENSE for details.
