From Method Text to Editable SVG
AutoFigure-edit is the next version of AutoFigure. It turns paper method sections into fully editable SVG figures and lets you refine them in an embedded SVG editor.
| Feature | Description |
|---|---|
| Text-to-Figure | Generate a draft figure directly from method text. |
| SAM3 Icon Detection | Detect icon regions from multiple prompts and merge overlaps. |
| Labeled Placeholders | Insert consistent AF-style placeholders for reliable SVG mapping. |
| SVG Generation | Produce an editable SVG template aligned to the figure. |
| Embedded Editor | Edit the SVG in-browser using the bundled svg-edit. |
| Artifact Outputs | Save PNG/SVG outputs and icon crops per run. |
| Chart-to-Code | Convert charts to Python code using SAM3 segmentation (optional) with code evaluation. |
| SVG-to-PPT | Export generated SVG figures directly to PowerPoint presentations. |
AutoFigure-edit introduces two breakthrough capabilities:
- Fully Editable SVGs (Pure Code Implementation): Unlike raster images, our outputs are structured Vector Graphics (SVG). Every component is editable: text, shapes, and layout can be modified losslessly.
- Style Transfer: The system can mimic the artistic style of reference images provided by the user.
Below are 9 examples covering 3 different papers. Each paper is generated using 3 different reference styles. (Each image shows: Left = AutoFigure Generation | Right = Vectorized Editable SVG)
| Paper & Style Transfer Demonstration |
|---|
| CycleResearcher / Style 1 |
| CycleResearcher / Style 2 |
| CycleResearcher / Style 3 |
| DeepReviewer / Style 1 |
| DeepReviewer / Style 2 |
| DeepReviewer / Style 3 |
| DeepScientist / Style 1 |
| DeepScientist / Style 2 |
| DeepScientist / Style 3 |
The AutoFigure-edit pipeline transforms a raw generation into an editable SVG in four distinct stages:
(1) Raw Generation → (2) SAM3 Segmentation → (3) SVG Layout Template → (4) Final Assembled Vector
- Generation (figure.png): The LLM generates a raster draft based on the method text.
- Segmentation (sam.png): SAM3 detects and segments distinct icons and text regions.
- Templating (template.svg): The system constructs a structural SVG wireframe using placeholders.
- Assembly (final.svg): High-quality cropped icons and vectorized text are injected into the template.
View Detailed Technical Pipeline
AutoFigure2's pipeline starts from the paper's method text and first calls a text-to-image LLM to render a journal-style schematic, saved as figure.png. The system then runs SAM3 segmentation on that image using one or more text prompts (e.g., "icon, diagram, arrow"), merges detections whose overlap exceeds an IoU-like threshold, and draws gray-filled, black-outlined labeled boxes on the original. This produces both samed.png (the labeled mask overlay) and a structured boxlib.json with coordinates, scores, and prompt sources.
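The overlap-merging step can be sketched as follows. This is a minimal illustration, not the project's implementation: the names `iou` and `merge_boxes`, the `(x1, y1, x2, y2)` box format, and the greedy union strategy are all assumptions. Only the threshold-based merge and the rule that a threshold of 0 disables merging (see `--merge_threshold`) come from this README.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_boxes(boxes: List[Box], threshold: float) -> List[Box]:
    """Greedily replace overlapping boxes with their bounding union."""
    if threshold <= 0:
        # A threshold of 0 disables merging.
        return list(boxes)
    merged: List[Box] = []
    for box in boxes:
        current = box
        changed = True
        while changed:
            changed = False
            for i, other in enumerate(merged):
                if iou(current, other) >= threshold:
                    # Absorb the overlapping box, then re-check the rest.
                    current = (
                        min(current[0], other[0]),
                        min(current[1], other[1]),
                        max(current[2], other[2]),
                        max(current[3], other[3]),
                    )
                    merged.pop(i)
                    changed = True
                    break
        merged.append(current)
    return merged
```

Taking the bounding union keeps every detected pixel inside the merged box; a variant that keeps only the highest-scoring box (as in non-maximum suppression) would be an equally plausible reading of "merge".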
Next, each box is cropped from the original figure and passed through RMBG-2.0 for background removal, yielding transparent icon assets under icons/*.png and *_nobg.png. With figure.png, samed.png, and boxlib.json as multimodal inputs, the LLM generates a placeholder-style SVG (template.svg) whose boxes match the labeled regions.
Optionally, the SVG is iteratively refined by an LLM optimizer to better align strokes, layouts, and styles, resulting in optimized_template.svg (or the original template if optimization is skipped). The system then compares the SVG dimensions with the original figure to compute scale factors and aligns coordinate systems. Finally, it replaces each placeholder in the SVG with the corresponding transparent icon (matched by label/ID), producing the assembled final.svg.
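A rough sketch of the scaling and placeholder-replacement step described above, using only the standard library. Everything specific here is an assumption made for illustration: the function names, the idea that each placeholder is an element whose `id` equals the box label (for example `AF01`), the `(x1, y1, x2, y2)` box format, and an SVG root with plain numeric `width` and `height` attributes. The real pipeline may match and embed icons differently.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

SVG_NS = "http://www.w3.org/2000/svg"
Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in figure.png pixels


def scale_factors(svg_size: Tuple[float, float],
                  img_size: Tuple[float, float]) -> Tuple[float, float]:
    """Per-axis factors that map image pixels to SVG user units."""
    return svg_size[0] / img_size[0], svg_size[1] / img_size[1]


def replace_placeholders(svg_text: str,
                         boxes: Dict[str, Box],
                         icons: Dict[str, str],
                         img_size: Tuple[float, float]) -> str:
    """Swap each labeled placeholder for an <image> pointing at its icon."""
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    # Assumption: width/height are unitless numbers on the root element.
    svg_size = (float(root.get("width")), float(root.get("height")))
    sx, sy = scale_factors(svg_size, img_size)

    # Snapshot the tree first so replacing children is safe while looping.
    for parent in list(root.iter()):
        for idx, child in enumerate(list(parent)):
            label = child.get("id")
            if label in boxes and label in icons:
                x1, y1, x2, y2 = boxes[label]
                image = ET.Element(f"{{{SVG_NS}}}image", {
                    "id": label,
                    "href": icons[label],  # SVG 2 style; SVG 1.1 uses xlink:href
                    "x": str(x1 * sx),
                    "y": str(y1 * sy),
                    "width": str((x2 - x1) * sx),
                    "height": str((y2 - y1) * sy),
                })
                parent[idx] = image
    return ET.tostring(root, encoding="unicode")
```

For instance, with a 400x200 figure and a 200x100 template, both factors are 0.5, so a box at (100, 50, 200, 150) becomes an image at x=50, y=25 with a 50x50 footprint.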
Key configuration details:
- Placeholder Mode: Controls how icon boxes are encoded in the prompt (label, box, or none).
- Optimization: optimize_iterations=0 allows skipping the refinement step to use the raw structure directly.
# 1) Create and activate conda environment
conda create -n autofigure python=3.10
conda activate autofigure
# 2) Install dependencies
pip install -r requirements.txt
# 3) Install SAM3 separately (not vendored in this repo)
pip install -e sam3

# Build Docker image
docker build -f docker/Dockerfile -t autofigure:latest .
# Run container with GPU support
docker run --name autofigure \
--gpus all \
--shm-size 32g \
-p 8000:8000 \
--ipc=host \
-v /path/to/models:/root/models \
-v /path/to/code:/app/ \
-it autofigure:latest /bin/bash

python server.py

Then open http://localhost:8000.
Run:
# Basic usage with text-to-image generation
python autofigure_main.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider bianxie \
--api_key YOUR_KEY
# Using local image (skip text-to-image generation)
python autofigure_main.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider local \
--local_img_path path/to/your/image.png \
--sam_checkpoint_path /path/to/sam3.pt
# Convert chart to Python code (with SAM3 segmentation)
python autofigure_main.py \
--method_file paper.txt \
--output_dir outputs/chart_demo \
--provider local \
--local_img_path path/to/chart.png \
--task_type chart_code \
--chart_use_sam \
--sam_checkpoint_path /path/to/sam3.pt \
--sam_prompt "axis,line,curve,bar,marker,legend,grid" \
--enable_evaluation \
--reference_code_path path/to/reference.py
# Generate SVG and convert to PowerPoint
python autofigure_main.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider local \
--local_img_path path/to/image.png \
--sam_checkpoint_path /path/to/sam3.pt \
--convert_to_ppt \
--ppt_output_path outputs/demo/result.pptx

AutoFigure-edit provides a visual web interface designed for seamless generation and editing.
On the start page, paste your paper's method text on the left. On the right, configure your generation settings:
- Provider: Select your LLM provider (OpenRouter or Bianxie).
- Optimize: Set the number of SVG template refinement iterations (0 is recommended for standard use).
- Reference Image: Upload a target image to enable style transfer.
- SAM3 Backend: Choose local SAM3 or the fal.ai API (API key optional).
The generation result loads directly into an integrated SVG-Edit canvas, allowing for full vector editing.
- Status & Logs: Check real-time progress (top-left) and view detailed execution logs (top-right button).
- Artifacts Drawer: Click the floating button (bottom-right) to expand the Artifacts Panel. This contains all intermediate outputs (icons, SVG templates, etc.). You can drag and drop any artifact directly onto the canvas for custom composition.
AutoFigure-edit depends on SAM3 but does not vendor it. Please follow the official SAM3 installation guide and prerequisites. The upstream repo currently targets Python 3.12+, PyTorch 2.7+, and CUDA 12.6 for GPU builds.
SAM3 checkpoints are hosted on Hugging Face and may require you to request
access and authenticate (e.g., huggingface-cli login) before download.
- SAM3 repo: https://github.com/facebookresearch/sam3
- SAM3 Hugging Face: https://huggingface.co/facebook/sam3
If you prefer not to install SAM3 locally, you can use an API backend (also supported in the Web demo). We recommend using Roboflow as it is free to use.
Option A: fal.ai
export FAL_KEY="your-fal-key"
python autofigure_main.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider bianxie \
--api_key YOUR_KEY \
--sam_backend fal

Option B: Roboflow
export ROBOFLOW_API_KEY="your-roboflow-key"
python autofigure_main.py \
--method_file paper.txt \
--output_dir outputs/demo \
--provider bianxie \
--api_key YOUR_KEY \
--sam_backend roboflow

Optional CLI flags (API):
- --sam_api_key (overrides FAL_KEY/ROBOFLOW_API_KEY)
- --sam_max_masks (default: 32, fal.ai only)
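
The key precedence described above (an explicit `--sam_api_key` wins, otherwise the backend's environment variable is used) might look like the sketch below. The helper `resolve_sam_api_key` and the `ENV_VARS` mapping are hypothetical names for illustration, not functions from this repository, and returning `None` for any other backend is an assumption.

```python
import os
from typing import Mapping, Optional

# Environment variable consulted for each API backend (names from this README).
ENV_VARS = {"fal": "FAL_KEY", "roboflow": "ROBOFLOW_API_KEY"}


def resolve_sam_api_key(backend: str,
                        cli_key: Optional[str] = None,
                        env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the API key to use, preferring the CLI flag over the environment."""
    env = os.environ if env is None else env
    if cli_key:
        # An explicit --sam_api_key overrides any environment variable.
        return cli_key
    var = ENV_VARS.get(backend)
    # Assumption: backends without a known variable simply get no key.
    return env.get(var) if var else None
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.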
| Provider | Base URL | Notes |
|---|---|---|
| OpenRouter | openrouter.ai/api/v1 | Supports Gemini/Claude/others |
| Bianxie | api.bianxie.ai/v1 | OpenAI-compatible API |
| Local | N/A | Use local images without text-to-image generation |
Common CLI flags:
- --provider (openrouter | bianxie | local)
- --image_model, --svg_model
- --local_img_path (path to local image when using local provider)
- --task_type (icon_svg | chart_code, default: icon_svg)
- --chart_use_sam (use SAM3 for chart code generation)
- --enable_evaluation (enable code evaluation for chart_code mode)
- --sam_prompt (comma-separated prompts)
- --sam_backend (local | fal | roboflow | api)
- --sam_checkpoint_path (path to SAM3 checkpoint)
- --sam_api_key (API key override; falls back to FAL_KEY or ROBOFLOW_API_KEY)
- --sam_max_masks (fal.ai max masks, default 32)
- --merge_threshold (0 disables merging)
- --optimize_iterations (0 disables optimization)
- --reference_image_path (optional, for style transfer)
- --convert_to_ppt (convert SVG to PowerPoint)
- --ppt_output_path (PPT output path)
- --reference_code_path (reference code path)
Directory tree:
AutoFigure-edit/
├── autofigure_main.py              # Main entry point
├── server.py                       # FastAPI backend for web interface
├── requirements.txt                # Python dependencies
├── autofigure/                     # Core package
│   ├── config.py                   # Configuration and provider settings
│   ├── pipeline/                   # Pipeline modules
│   │   ├── step1_generate.py       # Text-to-image generation
│   │   ├── step2_sam.py            # SAM3 segmentation
│   │   ├── step3_rmbg.py           # Background removal
│   │   ├── step4_svg_template.py   # SVG template generation
│   │   ├── step4_chart_code.py     # Chart-to-code conversion
│   │   ├── step5_replace_icons.py  # Final SVG assembly
│   │   ├── step6_optimize.py       # Multi-round feedback optimization
│   │   └── step7_evaluate.py       # Evaluate chart2code generation quality
│   ├── providers/                  # LLM provider implementations
│   │   ├── openrouter.py
│   │   ├── bianxie.py
│   │   └── local.py                # Local image mode
│   ├── processors/                 # Image processing utilities
│   ├── converters/                 # Format converters (SVG to PPT)
│   └── utils/                      # Helper functions
├── docker/                         # Docker deployment files
│   ├── Dockerfile
│   └── README.md
├── examples/                       # Example scripts and inputs
│   ├── testfigure.sh
│   └── testchart_local.sh
├── web/                            # Web interface frontend
│   ├── index.html
│   ├── canvas.html
│   ├── styles.css
│   ├── app.js
│   └── vendor/svg-edit/            # Embedded SVG editor
└── img/                            # README assets
WeChat Discussion Group
Scan the QR code to join our community. If the code is expired, please add WeChat ID nauhcutnil or contact tuchuan@mail.hfut.edu.cn.
If you find AutoFigure or FigureBench helpful, please cite:
@inproceedings{
zhu2026autofigure,
title={AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations},
author={Minjun Zhu and Zhen Lin and Yixuan Weng and Panzhong Lu and Qiujie Xie and Yifan Wei and Yifan_Wei and Sifan Liu and QiYao Sun and Yue Zhang},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=5N3z9JQJKq}
}
@dataset{figurebench2025,
title = {FigureBench: A Benchmark for Automated Scientific Illustration Generation},
author = {WestlakeNLP},
year = {2025},
url = {https://huggingface.co/datasets/WestlakeNLP/FigureBench}
}

This project is licensed under the MIT License - see LICENSE for details.













