Releases: OCR-D/ocrd_tesserocr

v0.20.1

01 Apr 15:34

Changed:

  • ocrd-tool.json: remove configs/ as processor resource → no more directory resources
  • Dockerfile: supplant configs/ resource for standalone CLI by pre-installing in tessdata

Added:

  • ocrd-tool.json: add model resources for all Tesseract languages and scripts

v0.20.0

07 Mar 16:00
4d1de95

Changed:

  • adapt to (and require) ocrd>=3.0 – allows running
    • with pages in parallel (OCRD_MAX_PARALLEL_PAGES) in tandem with METS Server
    • with page timeout (OCRD_PROCESSING_PAGE_TIMEOUT)
    • with page failure fallback copycat (OCRD_MISSING_OUTPUT=COPY), new default is SKIP instead of ABORT (now via --debug)
    • with page completion re-runs (OCRD_EXISTING_OUTPUT=SKIP), which is the new default instead of ABORT (now via --overwrite)
  • switched to pyproject.toml build, tracking version via ocrd-tool.json
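
The environment variables listed above can be combined in a single invocation. The following is a minimal sketch, not the project's documented workflow: the file group names, the model name `eng`, and the numeric values are illustrative assumptions, as is the reading of the timeout as a number of seconds. The helper `build_recognize_call` is hypothetical. Per the note above, page-parallel processing works in tandem with a METS Server, which this sketch does not start.

```python
import os
import shlex


def build_recognize_call(mets="mets.xml", input_grp="OCR-D-IMG",
                         output_grp="OCR-D-OCR", model="eng",
                         parallel_pages=4, page_timeout=120):
    """Hypothetical helper: assemble environment and argv for one run."""
    env = dict(os.environ)
    env.update({
        # variable names are taken from the release notes; values are assumptions
        "OCRD_MAX_PARALLEL_PAGES": str(parallel_pages),
        "OCRD_PROCESSING_PAGE_TIMEOUT": str(page_timeout),
        "OCRD_MISSING_OUTPUT": "COPY",   # copy input on page failure
        "OCRD_EXISTING_OUTPUT": "SKIP",  # skip pages that already have output
    })
    argv = ["ocrd-tesserocr-recognize",
            "-m", mets,
            "-I", input_grp,
            "-O", output_grp,
            "-P", "model", model]
    return env, argv


env, argv = build_recognize_call()
# Print the command for inspection; to execute it, one could call
# subprocess.run(argv, env=env, check=True) inside an OCR-D workspace.
print(shlex.join(argv))
```

Building the environment as a copy of `os.environ` keeps the settings local to the one call instead of changing them for the whole shell session.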

Added:

  • more test coverage (esp. modes w/o METS Server, METS caching, instance-caching, page-parallel)
  • Docker image includes preconfigured ocrd-all-tool.json for these processors

Fixed:

  • no more logging side effects between tests

v0.19.1

01 Jul 12:53
@kba kba

Fixed:

  • Correct version in ocrd-tool.json

v0.19.0

01 Jul 12:50
@kba kba

Fixed:

  • segment*/recognize: delegate process instead of overwrite for docstring
  • segment*/recognize: more robust polygon handling

Changed:

  • more tests, and more concrete ones
  • 🔥 require Shapely v2
  • Update tesseract to 5.4.1, #214

v0.18.0

19 Feb 17:52
@kba kba

Changed:

  • tesseract and tesserocr included as submodules, installable via make install-tesser{act,ocr}, #197
  • Updated docker setup accordingly, #197

v0.17.0

23 Mar 14:19

Fixed:

  • segment/recognize: fix shrink_polygons
  • segment/recognize: fix reinit scope (for xpath_model and auto_model)
  • CI: test multiple Python versions independent of ocrd/core image
  • CI: speed up build for EOL Python 3.6
  • CI: chmod o+w tessdata directory of PPA/OS Tesseract
  • deps-ubuntu: allow installation of PPA Tesseract to fail (for newer OS)

Changed:

  • adapted to Shapely v2
  • *: inherit from recognize (but override logger)
  • segment*: delegate constructor instead of wrapping instance
  • requires ocrd==2.48

v0.16.0

25 Oct 15:12
@kba kba

Changed:

  • require newer OCR-D/core to include OCR-D/core#934, #188
  • no more need to set TESSDATA_PREFIX
  • improved and up-to-date README

v0.15.0

23 Oct 13:54
@kba kba

Added:

  • binarize: dpi numerical parameter to specify pixel density, #186
  • binarize: tiseg boolean parameter to specify whether to call tessapi.AnalyseLayout for text-image separation, #186
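
As a rough illustration of the two new binarize parameters, the sketch below serializes them to JSON and passes them via `-p`. The file group names and the values `300` and `True` are assumptions chosen for the example, not recommended settings.

```python
import json
import shlex

# dpi and tiseg are the parameters named above; the values are illustrative
params = {"dpi": 300, "tiseg": True}

argv = ["ocrd-tesserocr-binarize",
        "-I", "OCR-D-IMG",   # assumed input file group
        "-O", "OCR-D-BIN",   # assumed output file group
        "-p", json.dumps(params)]

# Show the shell-quoted command line rather than executing it
print(shlex.join(argv))
```

Serializing with `json.dumps` keeps the boolean in JSON form (`true`), which is easy to get wrong when the string is written by hand.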

Changed:

  • recognize: improved polygon handling, #186
  • resources: proper support for moduledir, companion to OCR-D/core#904, #187

v0.14.0

14 Aug 16:16
@kba kba

Changed:

v0.13.6

28 Sep 11:22
ac27465

Fixed:

  • segment/recognize: no find_tables when already looking for cells

Changed:

  • segment/recognize: add param find_staves (for pageseg_apply_music_mask)
  • segment/recognize: 🔥 set find_staves=false by default