Refactor core architecture for extensibility, add SVG→PPTX, Chart→Code features and dockerfile#11
Open
kafkayu wants to merge 8 commits into ResearAI:main
As the title suggests, this PR makes the following changes relative to the current version:
Refactored the core code and split it into folders such as autofigure, examples, and docker.
Added a Dockerfile, along with instructions for using it.
To simplify Docker deployment, the SAM3 code and the resources it requires are now included.
To simplify image editing, added an SVG to PPTX conversion function, plus a function that converts a local image to SVG and then to PPTX without invoking the text-to-image (T2I) module. Example code can be found in AutoFigure-Edit/examples/testchart_local.sh.
To cover paper figures more comprehensively, added a chart to Python code conversion function. Users can optionally enable SAM3 as an auxiliary segmentation module, which may improve results in some cases.
Added evaluation of the generated chart-to-Python code; this requires specifying the path to the reference code.
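The chart to Python code feature runs the generated script and keeps a log of the run (the sample output below mentions chart_code.py and chart_code_run_initial.log). As a rough illustration only, that step could look something like the following sketch. The helper name `run_chart_script`, its signature, and the timeout are assumptions made for this example and are not taken from the PR's code.

```python
import subprocess
import sys
from pathlib import Path


def run_chart_script(script_path, run_name="initial", timeout=120):
    """Run a generated plotting script and write its output to a log file.

    Hypothetical helper: the name, arguments, and timeout are assumptions.
    Returns (succeeded, log_path).
    """
    script_path = Path(script_path).resolve()
    # e.g. chart_code.py with run_name="initial" -> chart_code_run_initial.log
    log_path = script_path.parent / f"{script_path.stem}_run_{run_name}.log"
    try:
        result = subprocess.run(
            [sys.executable, script_path.name],
            cwd=script_path.parent,  # relative output paths land next to the script
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        succeeded = result.returncode == 0
        log_text = result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        succeeded = False
        log_text = f"Timed out after {timeout} seconds\n"
    log_path.write_text(log_text, encoding="utf-8")
    return succeeded, log_path
```

Running the script in a separate process keeps a crash or hang in the generated code from taking down the caller, and the log file gives a later repair or evaluation step something to read.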
Sample Output:
sh examples/testchart_local.sh
============================================================
Paper Method to SVG icon replacement pipeline
Provider: local
Output directory: outputs/chart_demo_nosam
Image generation model:
SVG model: kimi-k2.5
============================================================
Step 1: Generate an academic-style image with the LLM
Provider: local
Model:
Sending request to: xxx
Image saved: outputs/chart_demo_nosam/figure.png
============================================================
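Academic chart code reproduction mode (without SAM3): generate Python plotting code from the original image only
============================================================
Step 4 (academic chart): Generate Python plotting code via a multimodal call
Provider: local
Model: kimi-k2.5
Sending multimodal request to: xxx
[Kimi] Image Base64 size: 770728 characters
[Kimi] Sending request: model=kimi-k2.5, max_tokens=50000
[Kimi] Message content includes: 1 text part, 1 image
[Kimi] API response status: completion=True, choices=1
[Kimi] Response content length: 7031
[Kimi] finish_reason: stop
Academic chart Python code saved: outputs/chart_demo_nosam/chart_code.py
Running academic chart code script: chart_code.py (run_name=initial)
chart_code.py finished running, log: outputs/chart_demo_nosam/chart_code_run_initial.log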
Results:
figure_path: outputs/chart_demo_nosam/figure.png
chart_code_path: outputs/chart_demo_nosam/chart_code.py
reconstructed_chart_path: outputs/chart_demo_nosam/reconstructed_chart.png
============================================================
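Step 7: Evaluate the generated Python code
Provider: local
Model: kimi-k2.5
Generated code: outputs/chart_demo_nosam/chart_code.py
Reference code: /app/examples/inputs/test.py
Note: image comparison is disabled; evaluation is based on the code only
Sending evaluation request to: xxx
Scores saved: outputs/chart_demo_nosam/evaluation_scores.json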
============================================================
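Scores:
Parameter accuracy: 20/25
Comment: Data values are mostly correct, but the scores in (c), (d), and (e) have rounding errors (e.g. 2.0 vs 2.05), and the ylim setting (115 vs 110) does not match the reference.
Visual similarity: 18/25
Comment: The color scheme is close but deviates in places (e.g. the gray used for Reference), the title is centered rather than left-aligned, and the error bar values are inconsistent.
Code executability: 25/25
Comment: The code is structurally complete, all imports are present, there are no syntax errors, and it runs standalone and saves the image correctly.
Total: 63/75
Overall comment: The code is complete and runnable and the data is largely accurate, but visual details (exact color values, title alignment, data precision) still differ from the reference code and could be improved.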
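
The scores above are three dimensions of 25 points each, summed to a total out of 75. For illustration, a minimal sketch of how such a result might be validated, totaled, and written to a JSON file is shown below. The key names, the function names `summarize_scores` and `save_scores`, and the JSON layout are all assumptions; the actual schema of evaluation_scores.json is not shown in this PR description.

```python
import json
from pathlib import Path

MAX_PER_DIMENSION = 25  # each dimension in the sample output is scored out of 25


def summarize_scores(scores):
    """Validate per-dimension scores and add a total (hypothetical layout)."""
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_DIMENSION:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_DIMENSION}, got {value}")
    return {
        "scores": dict(scores),
        "total": sum(scores.values()),
        "max_total": MAX_PER_DIMENSION * len(scores),
    }


def save_scores(summary, path):
    """Write the summary as UTF-8 JSON, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(summary, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


# Values taken from the sample output above; the key names are assumed.
summary = summarize_scores({
    "parameter_accuracy": 20,
    "visual_similarity": 18,
    "code_executability": 25,
})
print(f"Total: {summary['total']}/{summary['max_total']}")  # Total: 63/75
```

Computing the total in code rather than trusting a model-reported sum makes it easy to catch an evaluation response whose per-dimension scores and total disagree.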