·
1 commit
to master
since this release
What's Changed
This version includes several fixes and improvements to enhance parsing efficiency and accuracy:
- Performance Optimization
- Increased classification speed for PDF documents in auto mode.
- Parsing Optimization
- Improved parsing logic for documents containing watermarks, significantly enhancing the parsing results for such documents.
- Enhanced the matching logic for multiple images/tables and captions within a single page, improving the accuracy of image-text matching in complex layouts.
- Bug Fixes
- Fixed an issue where image/table spans were incorrectly filled into text blocks under certain conditions.
- Resolved an issue where title blocks were empty in some cases.
这个版本我们修复了一些问题,提升了解析的效率与精度:
- 性能优化
- auto模式下pdf文档的分类速度提升
- 在华为昇腾 NPU 加速模式下,添加高性能插件支持,常见场景下端到端加速可达 300% 申请链接
- 解析优化
- 优化对包含水印文档的解析逻辑,显著提升包含水印文档的解析效果
- 改进了单页内多个图像/表格与caption的匹配逻辑,提升了复杂布局下图文匹配的准确性
- 问题修复
- 修复在某些情况下图片/表格span被填充进textblock导致的异常
- 修复在某些情况下标题block为空的问题
New Contributors
- @shniubobo made their first contribution in #1693
- @xuzijie1995 made their first contribution in #1743
- @nadahlberg made their first contribution in #1748
Full Changelog: magic_pdf-1.1.0-released...magic_pdf-1.2.0-released