{vis}[GCCcore/13.3.0] tesseract v5.5.0, Leptonica v1.85.0 #22318

Merged


pavelToman
Collaborator

@pavelToman commented Feb 17, 2025

(created using eb --new-pr)
resolves vscentrum/vsc-software-stack#511


Updated software Leptonica-1.85.0-GCCcore-13.3.0.eb

Diff against Leptonica-1.84.1-GCCcore-12.3.0.eb

easybuild/easyconfigs/l/Leptonica/Leptonica-1.84.1-GCCcore-12.3.0.eb

diff --git a/easybuild/easyconfigs/l/Leptonica/Leptonica-1.84.1-GCCcore-12.3.0.eb b/easybuild/easyconfigs/l/Leptonica/Leptonica-1.85.0-GCCcore-13.3.0.eb
index ee92fb98c6..88731b6488 100644
--- a/easybuild/easyconfigs/l/Leptonica/Leptonica-1.84.1-GCCcore-12.3.0.eb
+++ b/easybuild/easyconfigs/l/Leptonica/Leptonica-1.85.0-GCCcore-13.3.0.eb
@@ -1,27 +1,27 @@
 easyblock = 'ConfigureMake'
 
 name = 'Leptonica'
-version = '1.84.1'
+version = '1.85.0'
 
 homepage = 'http://www.leptonica.org'
 description = """Leptonica is a collection of pedagogically-oriented open source software
  that is broadly useful for image processing and image analysis applications."""
 
-toolchain = {'name': 'GCCcore', 'version': '12.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
 
 source_urls = ['https://github.com/DanBloomberg/leptonica/releases/download/%(version)s/']
 sources = [SOURCELOWER_TAR_GZ]
-checksums = ['2b3e1254b1cca381e77c819b59ca99774ff43530209b9aeb511e1d46588a64f6']
+checksums = ['3745ae3bf271a6801a2292eead83ac926e3a9bc1bf622e9cd4dd0f3786e17205']
 
-builddependencies = [('binutils', '2.40')]
+builddependencies = [('binutils', '2.42')]
 
 dependencies = [
-    ('libpng', '1.6.39'),
-    ('LibTIFF', '4.5.0'),
-    ('libjpeg-turbo', '2.1.5.1'),
+    ('libpng', '1.6.43'),
+    ('LibTIFF', '4.6.0'),
+    ('libjpeg-turbo', '3.0.1'),
     ('giflib', '5.2.1'),
-    ('libwebp', '1.3.1'),
-    ('zlib', '1.2.13'),
+    ('libwebp', '1.4.0'),
+    ('zlib', '1.3.1'),
 ]
 
 sanity_check_paths = {
Diff against Leptonica-1.83.0-GCCcore-11.3.0.eb

easybuild/easyconfigs/l/Leptonica/Leptonica-1.83.0-GCCcore-11.3.0.eb

diff --git a/easybuild/easyconfigs/l/Leptonica/Leptonica-1.83.0-GCCcore-11.3.0.eb b/easybuild/easyconfigs/l/Leptonica/Leptonica-1.85.0-GCCcore-13.3.0.eb
index d4743b6cc7..88731b6488 100644
--- a/easybuild/easyconfigs/l/Leptonica/Leptonica-1.83.0-GCCcore-11.3.0.eb
+++ b/easybuild/easyconfigs/l/Leptonica/Leptonica-1.85.0-GCCcore-13.3.0.eb
@@ -1,27 +1,27 @@
 easyblock = 'ConfigureMake'
 
 name = 'Leptonica'
-version = '1.83.0'
+version = '1.85.0'
 
 homepage = 'http://www.leptonica.org'
 description = """Leptonica is a collection of pedagogically-oriented open source software
  that is broadly useful for image processing and image analysis applications."""
 
-toolchain = {'name': 'GCCcore', 'version': '11.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
 
 source_urls = ['https://github.com/DanBloomberg/leptonica/releases/download/%(version)s/']
 sources = [SOURCELOWER_TAR_GZ]
-checksums = ['206591dd58cf84ef380836dad133b58c9d1af92491f5a9825c346a162044bcfe']
+checksums = ['3745ae3bf271a6801a2292eead83ac926e3a9bc1bf622e9cd4dd0f3786e17205']
 
-builddependencies = [('binutils', '2.38')]
+builddependencies = [('binutils', '2.42')]
 
 dependencies = [
-    ('libpng', '1.6.37'),
-    ('LibTIFF', '4.3.0'),
-    ('libjpeg-turbo', '2.1.3'),
+    ('libpng', '1.6.43'),
+    ('LibTIFF', '4.6.0'),
+    ('libjpeg-turbo', '3.0.1'),
     ('giflib', '5.2.1'),
-    ('libwebp', '1.2.4'),
-    ('zlib', '1.2.12'),
+    ('libwebp', '1.4.0'),
+    ('zlib', '1.3.1'),
 ]
 
 sanity_check_paths = {
Diff against Leptonica-1.82.0-GCCcore-10.3.0.eb

easybuild/easyconfigs/l/Leptonica/Leptonica-1.82.0-GCCcore-10.3.0.eb

diff --git a/easybuild/easyconfigs/l/Leptonica/Leptonica-1.82.0-GCCcore-10.3.0.eb b/easybuild/easyconfigs/l/Leptonica/Leptonica-1.85.0-GCCcore-13.3.0.eb
index 35dbc6b8a3..88731b6488 100644
--- a/easybuild/easyconfigs/l/Leptonica/Leptonica-1.82.0-GCCcore-10.3.0.eb
+++ b/easybuild/easyconfigs/l/Leptonica/Leptonica-1.85.0-GCCcore-13.3.0.eb
@@ -1,31 +1,31 @@
 easyblock = 'ConfigureMake'
 
 name = 'Leptonica'
-version = '1.82.0'
+version = '1.85.0'
 
 homepage = 'http://www.leptonica.org'
 description = """Leptonica is a collection of pedagogically-oriented open source software
  that is broadly useful for image processing and image analysis applications."""
 
-toolchain = {'name': 'GCCcore', 'version': '10.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
 
 source_urls = ['https://github.com/DanBloomberg/leptonica/releases/download/%(version)s/']
 sources = [SOURCELOWER_TAR_GZ]
-checksums = ['155302ee914668c27b6fe3ca9ff2da63b245f6d62f3061c8f27563774b8ae2d6']
+checksums = ['3745ae3bf271a6801a2292eead83ac926e3a9bc1bf622e9cd4dd0f3786e17205']
 
-builddependencies = [('binutils', '2.36.1')]
+builddependencies = [('binutils', '2.42')]
 
 dependencies = [
-    ('libpng', '1.6.37'),
-    ('LibTIFF', '4.2.0'),
-    ('libjpeg-turbo', '2.0.6'),
+    ('libpng', '1.6.43'),
+    ('LibTIFF', '4.6.0'),
+    ('libjpeg-turbo', '3.0.1'),
     ('giflib', '5.2.1'),
-    ('libwebp', '1.2.0'),
-    ('zlib', '1.2.11'),
+    ('libwebp', '1.4.0'),
+    ('zlib', '1.3.1'),
 ]
 
 sanity_check_paths = {
-    'files': ['bin/convertformat', 'lib/liblept.%s' % SHLIB_EXT],
+    'files': ['bin/convertformat'],
     'dirs': ['include/leptonica', 'lib/pkgconfig']
 }
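
For readers unfamiliar with `sanity_check_paths`: it lists files and directories, relative to the installation prefix, that must exist after the build. The sketch below is a simplified stand-in, not EasyBuild's actual implementation; the helper name `check_sanity_paths` is hypothetical. The dictionary is copied from the Leptonica 1.85.0 easyconfig, which only requires `bin/convertformat` among the files.

```python
import os


def check_sanity_paths(prefix, sanity_check_paths):
    """Return the relative paths from sanity_check_paths missing under prefix.

    Simplified assumption: a 'files' entry must be an existing file and a
    'dirs' entry must be an existing directory. EasyBuild's real check may
    apply additional rules.
    """
    missing = []
    for rel in sanity_check_paths.get('files', []):
        if not os.path.isfile(os.path.join(prefix, rel)):
            missing.append(rel)
    for rel in sanity_check_paths.get('dirs', []):
        if not os.path.isdir(os.path.join(prefix, rel)):
            missing.append(rel)
    return missing


# Values copied from the Leptonica 1.85.0 easyconfig in this PR.
leptonica_paths = {
    'files': ['bin/convertformat'],
    'dirs': ['include/leptonica', 'lib/pkgconfig'],
}
```

Calling `check_sanity_paths('/path/to/install', leptonica_paths)` returns an empty list when all three paths are present; the install path here is a placeholder.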
 

Updated software tesseract-5.5.0-GCCcore-13.3.0.eb

Diff against tesseract-5.3.4-GCCcore-12.3.0.eb

easybuild/easyconfigs/t/tesseract/tesseract-5.3.4-GCCcore-12.3.0.eb

diff --git a/easybuild/easyconfigs/t/tesseract/tesseract-5.3.4-GCCcore-12.3.0.eb b/easybuild/easyconfigs/t/tesseract/tesseract-5.5.0-GCCcore-13.3.0.eb
index 94cf0915ff..edd4d41c4f 100644
--- a/easybuild/easyconfigs/t/tesseract/tesseract-5.3.4-GCCcore-12.3.0.eb
+++ b/easybuild/easyconfigs/t/tesseract/tesseract-5.5.0-GCCcore-13.3.0.eb
@@ -1,13 +1,13 @@
 easyblock = 'CMakeMake'
 
 name = 'tesseract'
-version = '5.3.4'
+version = '5.5.0'
 _tessdata_ver = '4.1.0'
 
 homepage = 'https://github.com/tesseract-ocr/tesseract'
 description = """Tesseract is an optical character recognition engine"""
 
-toolchain = {'name': 'GCCcore', 'version': '12.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
 
 github_account = 'tesseract-ocr'
 source_urls = [GITHUB_SOURCE]
@@ -20,28 +20,28 @@ sources = [
     },
 ]
 checksums = [
-    {'5.3.4.tar.gz': '141afc12b34a14bb691a939b4b122db0d51bd38feda7f41696822bacea7710c7'},
+    {'5.5.0.tar.gz': 'f2fb34ca035b6d087a42875a35a7a5c4155fa9979c6132365b1e5a28ebc3fc11'},
     {'tessdata_best-4.1.0.tar.gz': 'bb05b738298ae73e7130e2913ed002b49d94cd1cea508e63be1928fe47770b32'},
 ]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('binutils', '2.40'),
-    ('pkgconf', '1.9.5')
+    ('CMake', '3.29.3'),
+    ('binutils', '2.42'),
+    ('pkgconf', '2.2.0'),
 ]
 
 dependencies = [
-    ('zlib', '1.2.13'),
-    ('libpng', '1.6.39'),
-    ('libjpeg-turbo', '2.1.5.1'),
-    ('LibTIFF', '4.5.0'),
-    ('Leptonica', '1.84.1'),
-    ('libarchive', '3.6.2'),
-    ('ICU', '73.2'),
-    ('fontconfig', '2.14.2'),
-    ('GLib', '2.77.1'),
-    ('cairo', '1.17.8'),
-    ('Pango', '1.50.14'),
+    ('zlib', '1.3.1'),
+    ('libpng', '1.6.43'),
+    ('libjpeg-turbo', '3.0.1'),
+    ('LibTIFF', '4.6.0'),
+    ('Leptonica', '1.85.0'),
+    ('libarchive', '3.7.4'),
+    ('ICU', '75.1'),
+    ('fontconfig', '2.15.0'),
+    ('GLib', '2.80.4'),
+    ('cairo', '1.18.0'),
+    ('Pango', '1.54.0'),
 ]
 
 configopts = ['-DBUILD_SHARED_LIBS=ON', '-DBUILD_SHARED_LIBS=OFF']
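
The two-element `configopts` list is not a typo. To my understanding, EasyBuild treats a list value for `configopts` as a request for an iterated build: the configure, build and install steps run once per element, here first with shared and then with static libraries. The sketch below only mimics that loop to show the resulting invocations; `plan_iterations` is a hypothetical helper and the command layout is simplified, not what the `CMakeMake` easyblock literally runs.

```python
def plan_iterations(configopts, source_dir='..'):
    """Return one simplified cmake command line per configopts entry.

    A plain string is treated as a single iteration; a list yields one
    iteration per element (assumed to mirror EasyBuild's iterated builds).
    """
    if isinstance(configopts, str):
        configopts = [configopts]
    return ['cmake %s %s' % (opts, source_dir) for opts in configopts]


# Value copied from the tesseract 5.5.0 easyconfig in this PR.
configopts = ['-DBUILD_SHARED_LIBS=ON', '-DBUILD_SHARED_LIBS=OFF']

for number, command in enumerate(plan_iterations(configopts), start=1):
    print('iteration %d: %s' % (number, command))
```

This reading is consistent with the tesseract sanity check in this PR, which expects both `lib/libtesseract.a` and the shared library.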
Diff against tesseract-5.3.0-GCCcore-11.3.0.eb

easybuild/easyconfigs/t/tesseract/tesseract-5.3.0-GCCcore-11.3.0.eb

diff --git a/easybuild/easyconfigs/t/tesseract/tesseract-5.3.0-GCCcore-11.3.0.eb b/easybuild/easyconfigs/t/tesseract/tesseract-5.5.0-GCCcore-13.3.0.eb
index 9bf893a4d1..edd4d41c4f 100644
--- a/easybuild/easyconfigs/t/tesseract/tesseract-5.3.0-GCCcore-11.3.0.eb
+++ b/easybuild/easyconfigs/t/tesseract/tesseract-5.5.0-GCCcore-13.3.0.eb
@@ -1,12 +1,13 @@
 easyblock = 'CMakeMake'
 
 name = 'tesseract'
-version = '5.3.0'
+version = '5.5.0'
+_tessdata_ver = '4.1.0'
 
 homepage = 'https://github.com/tesseract-ocr/tesseract'
 description = """Tesseract is an optical character recognition engine"""
 
-toolchain = {'name': 'GCCcore', 'version': '11.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
 
 github_account = 'tesseract-ocr'
 source_urls = [GITHUB_SOURCE]
@@ -14,41 +15,41 @@ sources = [
     '%(version)s.tar.gz',
     {
         'source_urls': ['https://github.com/tesseract-ocr/tessdata_best/archive/'],
-        'download_filename': '4.1.0.tar.gz',
-        'filename': 'tessdata_best-4.1.0.tar.gz',
+        'download_filename': '%s.tar.gz' % _tessdata_ver,
+        'filename': 'tessdata_best-%s.tar.gz' % _tessdata_ver,
     },
 ]
 checksums = [
-    {'5.3.0.tar.gz': '7e70870f8341e5ea228af2836ce79a36eefa11b01b56177b4a8997f330c014b8'},
+    {'5.5.0.tar.gz': 'f2fb34ca035b6d087a42875a35a7a5c4155fa9979c6132365b1e5a28ebc3fc11'},
     {'tessdata_best-4.1.0.tar.gz': 'bb05b738298ae73e7130e2913ed002b49d94cd1cea508e63be1928fe47770b32'},
 ]
 
 builddependencies = [
-    ('CMake', '3.24.3'),
-    ('binutils', '2.38'),
-    ('pkgconf', '1.8.0')
+    ('CMake', '3.29.3'),
+    ('binutils', '2.42'),
+    ('pkgconf', '2.2.0'),
 ]
 
 dependencies = [
-    ('zlib', '1.2.12'),
-    ('libpng', '1.6.37'),
-    ('libjpeg-turbo', '2.1.3'),
-    ('LibTIFF', '4.3.0'),
-    ('Leptonica', '1.83.0'),
-    ('libarchive', '3.6.1'),
-    ('ICU', '71.1'),
-    ('fontconfig', '2.14.0'),
-    ('GLib', '2.72.1'),
-    ('cairo', '1.17.4'),
-    ('Pango', '1.50.7'),
+    ('zlib', '1.3.1'),
+    ('libpng', '1.6.43'),
+    ('libjpeg-turbo', '3.0.1'),
+    ('LibTIFF', '4.6.0'),
+    ('Leptonica', '1.85.0'),
+    ('libarchive', '3.7.4'),
+    ('ICU', '75.1'),
+    ('fontconfig', '2.15.0'),
+    ('GLib', '2.80.4'),
+    ('cairo', '1.18.0'),
+    ('Pango', '1.54.0'),
 ]
 
 configopts = ['-DBUILD_SHARED_LIBS=ON', '-DBUILD_SHARED_LIBS=OFF']
 
 postinstallcmds = [
-    'rm %(builddir)s/tessdata_best-4.1.0/configs',
-    'rm -rf %(builddir)s/tessdata_best-4.1.0/tessconfigs',
-    'mv %(builddir)s/tessdata_best-4.1.0/* %(installdir)s/share/tessdata'
+    'rm %(builddir)s/tessdata_best-*/configs',
+    'rm -rf %(builddir)s/tessdata_best-*/tessconfigs',
+    'mv %(builddir)s/tessdata_best-*/* %(installdir)s/share/tessdata'
 ]
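
The wildcard in the new `postinstallcmds` removes the hard-coded tessdata version from the shell commands. A rough Python equivalent of those three commands is sketched below, purely to spell out what they do; `install_tessdata` is a hypothetical helper, and the easyconfig itself runs the shell commands as written.

```python
import glob
import os
import shutil


def install_tessdata(builddir, installdir):
    """Mimic: rm .../configs; rm -rf .../tessconfigs; mv .../* share/tessdata.

    Assumptions: the target directory is created if absent (the shell 'mv'
    relies on it already existing), and a missing 'configs' entry is ignored
    (plain 'rm' would fail instead).
    """
    target = os.path.join(installdir, 'share', 'tessdata')
    os.makedirs(target, exist_ok=True)
    for srcdir in glob.glob(os.path.join(builddir, 'tessdata_best-*')):
        # Drop the two entries that the easyconfig removes before moving.
        for name in ('configs', 'tessconfigs'):
            path = os.path.join(srcdir, name)
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            elif os.path.lexists(path):
                os.remove(path)
        # Like the shell '*', glob skips hidden (dot) entries.
        for entry in glob.glob(os.path.join(srcdir, '*')):
            shutil.move(entry, os.path.join(target, os.path.basename(entry)))
    return target
```

Because the directory is matched by pattern, bumping `_tessdata_ver` in the easyconfig would not require touching these commands.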
 
 modextrapaths = {
Diff against tesseract-4.1.0-GCCcore-10.3.0.eb

easybuild/easyconfigs/t/tesseract/tesseract-4.1.0-GCCcore-10.3.0.eb

diff --git a/easybuild/easyconfigs/t/tesseract/tesseract-4.1.0-GCCcore-10.3.0.eb b/easybuild/easyconfigs/t/tesseract/tesseract-5.5.0-GCCcore-13.3.0.eb
index f2343bc257..edd4d41c4f 100644
--- a/easybuild/easyconfigs/t/tesseract/tesseract-4.1.0-GCCcore-10.3.0.eb
+++ b/easybuild/easyconfigs/t/tesseract/tesseract-5.5.0-GCCcore-13.3.0.eb
@@ -1,64 +1,66 @@
 easyblock = 'CMakeMake'
 
 name = 'tesseract'
-version = '4.1.0'
+version = '5.5.0'
+_tessdata_ver = '4.1.0'
 
 homepage = 'https://github.com/tesseract-ocr/tesseract'
 description = """Tesseract is an optical character recognition engine"""
 
-toolchain = {'name': 'GCCcore', 'version': '10.3.0'}
+toolchain = {'name': 'GCCcore', 'version': '13.3.0'}
 
 github_account = 'tesseract-ocr'
 source_urls = [GITHUB_SOURCE]
 sources = [
     '%(version)s.tar.gz',
     {
-        'source_urls': ['https://github.com/tesseract-ocr/tessdata/archive/'],
-        'download_filename': '4.1.0.tar.gz',
-        'filename': 'tessdata-4.1.0.tar.gz',
+        'source_urls': ['https://github.com/tesseract-ocr/tessdata_best/archive/'],
+        'download_filename': '%s.tar.gz' % _tessdata_ver,
+        'filename': 'tessdata_best-%s.tar.gz' % _tessdata_ver,
     },
 ]
-patches = ['tesseract-4.1.0-add-glib-dependency.patch']
 checksums = [
-    '5c5ed5f1a76888dc57a83704f24ae02f8319849f5c4cf19d254296978a1a1961',  # 4.1.0.tar.gz
-    '990fffb9b7a9b52dc9a2d053a9ef6852ca2b72bd8dfb22988b0b990a700fd3c7',  # tessdata-4.1.0.tar.gz
-    'f21ac5ed7d28a07978a6f7230fce2125d98a7264e33ddd3bd648af6da41b6fd1',  # tesseract-4.1.0-add-glib-dependency.patch
+    {'5.5.0.tar.gz': 'f2fb34ca035b6d087a42875a35a7a5c4155fa9979c6132365b1e5a28ebc3fc11'},
+    {'tessdata_best-4.1.0.tar.gz': 'bb05b738298ae73e7130e2913ed002b49d94cd1cea508e63be1928fe47770b32'},
 ]
 
 builddependencies = [
-    ('CMake', '3.20.1'),
-    ('binutils', '2.36.1'),
-    ('pkg-config', '0.29.2')
+    ('CMake', '3.29.3'),
+    ('binutils', '2.42'),
+    ('pkgconf', '2.2.0'),
 ]
 
 dependencies = [
-    ('zlib', '1.2.11'),
-    ('libpng', '1.6.37'),
-    ('libjpeg-turbo', '2.0.6'),
-    ('LibTIFF', '4.2.0'),
-    ('Leptonica', '1.82.0'),
-    ('libarchive', '3.5.1'),
-    ('ICU', '69.1'),
-    ('fontconfig', '2.13.93'),
-    ('GLib', '2.68.2'),
-    ('cairo', '1.16.0'),
-    ('Pango', '1.48.5'),
+    ('zlib', '1.3.1'),
+    ('libpng', '1.6.43'),
+    ('libjpeg-turbo', '3.0.1'),
+    ('LibTIFF', '4.6.0'),
+    ('Leptonica', '1.85.0'),
+    ('libarchive', '3.7.4'),
+    ('ICU', '75.1'),
+    ('fontconfig', '2.15.0'),
+    ('GLib', '2.80.4'),
+    ('cairo', '1.18.0'),
+    ('Pango', '1.54.0'),
 ]
 
-separate_build_dir = True
+configopts = ['-DBUILD_SHARED_LIBS=ON', '-DBUILD_SHARED_LIBS=OFF']
 
 postinstallcmds = [
-    'mkdir %(installdir)s/tessdata',
-    'mv %(builddir)s/tessdata-4.1.0/* %(installdir)s/tessdata'
+    'rm %(builddir)s/tessdata_best-*/configs',
+    'rm -rf %(builddir)s/tessdata_best-*/tessconfigs',
+    'mv %(builddir)s/tessdata_best-*/* %(installdir)s/share/tessdata'
 ]
 
 modextrapaths = {
-    'TESSDATA_PREFIX': 'tessdata',
+    'TESSDATA_PREFIX': 'share/tessdata',
 }
 
 sanity_check_paths = {
-    'files': ['bin/tesseract', 'lib/libtesseract.%s' % SHLIB_EXT],
-    'dirs': ['tessdata', 'include/tesseract']
+    'files': ['bin/tesseract', 'lib/libtesseract.a', 'lib/libtesseract.%s' % SHLIB_EXT],
+    'dirs': ['share/tessdata', 'include/tesseract']
 }
 
+sanity_check_commands = ['tesseract --version', 'tesseract --list-langs']
+
 moduleclass = 'vis'
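
Each `checksums` entry in the new easyconfig maps a source filename to its expected SHA256 digest. A minimal sketch of that kind of verification follows, using only `hashlib`; `verify_checksums` and `sha256_of` are hypothetical helpers, not EasyBuild's own routines.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(source_dir, checksums):
    """Return the filenames whose SHA256 differs from the expected value.

    'checksums' uses the same shape as the easyconfig: a list of
    {filename: sha256} dictionaries.
    """
    mismatches = []
    for entry in checksums:
        for filename, expected in entry.items():
            if sha256_of(os.path.join(source_dir, filename)) != expected:
                mismatches.append(filename)
    return mismatches
```

With the downloaded tarballs in a directory, `verify_checksums(source_dir, checksums)` would return an empty list when both digests match.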

@pavelToman
Collaborator Author

@boegelbot please test @ jsc-zen3

@boegelbot
Collaborator

@pavelToman: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=22318 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_22318 --ntasks=8 ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

  • exit code: 0
  • output:
Submitted batch job 5733

Test results coming soon (I hope)...

- notification for comment with ID 2663118562 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

@boegelbot
Collaborator

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.5, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.21
See https://gist.github.com/boegelbot/d13e5a09db0b9b21d686a80c35a4b853 for a full test report.

@pavelToman
Collaborator Author

Test report by @pavelToman
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node3101.skitty.os - Linux RHEL 9.4, x86_64, Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz, Python 3.9.18
See https://gist.github.com/pavelToman/f2d1d6fc096dc52ed3d47c6a510787ee for a full test report.

@pavelToman
Collaborator Author

Test report by @pavelToman
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node4007.donphan.os - Linux RHEL 8.8, x86_64, Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz, 1 x NVIDIA NVIDIA A2, 545.23.08, Python 3.6.8
See https://gist.github.com/pavelToman/fae7e53bd185f589d69cf69f9122d686 for a full test report.

@pavelToman
Collaborator Author

Test report by @pavelToman
SUCCESS
Build succeeded for 2 out of 2 (2 easyconfigs in total)
node4205.shinx.os - Linux RHEL 9.4, x86_64, AMD EPYC 9654 96-Core Processor, Python 3.9.18
See https://gist.github.com/pavelToman/f78214f206140f328994b29cc313c012 for a full test report.

@smoors added this to the "release after 4.9.4" milestone on Feb 18, 2025
Contributor

@smoors left a comment


lgtm

@smoors
Contributor

smoors commented Feb 18, 2025

Going in, thanks @pavelToman!

@smoors merged commit 8109526 into easybuilders:develop on Feb 18, 2025
10 checks passed
Successfully merging this pull request may close these issues.

Tesseract-OCR
3 participants