Extra spaces in text copied from the output PDF when the ocrmypdf input PDF is in Chinese #1391

deict · 2024-08-30T10:02:00Z

Simple sanity checks

This is an issue with an app that uses OCRmyPDF for OCR
I am using a recent version of the third party app
I will include a file that reproduces the issue

Third party app name and version

No response

Describe the bug

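Screenshot after recognition with ocrmypdf
Content copied from the recognized PDF: ‘短期负荷预测; 影响负荷的因素很多 ,存在着不确定性 ,首先需要进行一定的数据清洗 ,
过滁掉一些数据 , 然后进行特征选择 , 选取上一天的负荷’
There are extra spaces in between, around the punctuation.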
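
To make the symptom easier to check on other files, here is a minimal sketch that counts and strips such spaces in text copied from the output PDF. It is only an illustration: `count_spurious_spaces` and `strip_spurious_spaces` are hypothetical helper names, not part of OCRmyPDF, and the sketch assumes that a space is spurious only when it sits between CJK ideographs or the listed punctuation marks.

```python
import re

# Assumption: characters in this class never need a space between them.
# It covers the basic CJK ideograph block plus a few punctuation marks
# seen in the copied sample; extend it as needed.
CLS = r"\u4e00-\u9fff,;，；。"

# Spaces or tabs that have a character from CLS on both sides.
SPURIOUS = re.compile(rf"(?<=[{CLS}])[ \t]+(?=[{CLS}])")


def count_spurious_spaces(text: str) -> int:
    """Return how many runs of spurious spaces the text contains."""
    return len(SPURIOUS.findall(text))


def strip_spurious_spaces(text: str) -> str:
    """Return the text with the spurious space runs removed."""
    return SPURIOUS.sub("", text)


if __name__ == "__main__":
    # Fragment taken from the text copied out of the OCRed PDF.
    sample = "过滁掉一些数据 , 然后进行特征选择 , 选取上一天的负荷"
    print(count_spurious_spaces(sample))
    print(strip_spurious_spaces(sample))
```

This only cleans up text after copying; it does not change the text layer that OCRmyPDF writes into the PDF.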

Screenshot of the run

Steps to reproduce

1. Import attached file into Paperless-ngx
2. Trigger OCR
3. Check log file
4. ....\ocrmypdf -l chi_sim C:/Users/15179/Pictures/pp.pdf C:/Users/15179/Pictures/pp2.pdf

Files

pp.pdf

OCRmyPDF version

16.4.3

Relevant log output

No response

deict added the triage Issue needs triage label Aug 30, 2024

deict assigned jbarlow83 Aug 30, 2024