Skip to content

gh-91349: Replace zlib with zlib-ng in Windows build #131438

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Mar 19, 2025
Merged

Conversation

zooba
Copy link
Member

@zooba zooba commented Mar 19, 2025

"externalRefs": [
{
"referenceCategory": "SECURITY",
"referenceLocator": "cpe:2.3:a:zlib:zlib:1.3.1:*:*:*:*:*:*:*",
"referenceLocator": "cpe:2.3:a:zlib-ng:zlib-ng:2.2.4:*:*:*:*:*:*:*",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit I just guessed this, in order to unblock my testing (couldn't build at all with an invalid SBOM). If it's not right, let me know

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely the CPE that would get used, although we can't be for certain until a CVE exists with it... the only one I could find with some searching is cpe:...:zlib-ng:minizip-ng which is for a different component but at least the organization is a match.

@@ -2101,6 +2101,12 @@ zlib_exec(PyObject *mod)
PyUnicode_FromString(zlibVersion())) < 0) {
return -1;
}
#ifdef ZLIBNG_VERSION
if (PyModule_Add(mod, "ZLIBNG_VERSION",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Figured we'd want some way to detect this other than looking at the text in ZLIB_VERSION

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. annoying to have optional attributes that one would use hasattr or getattr on instead of blindly accessing, but realistically nobody should care as this is more internal informational so this is fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the main place it'll be used is in pythoninfo, which already handles absent attributes.

We have more offensive optional attributes in our extension modules 😆

@zooba
Copy link
Member Author

zooba commented Mar 19, 2025

I've reviewed the three header files added to PC and they seem generic. It's a shame they aren't part of the sources, but figured it's easier/safer to just copy them into our source tree so that the cpython-source-deps repo is a straight import.

@zooba zooba added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 19, 2025
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @zooba for commit 08eecb1 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F131438%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Mar 19, 2025
@zooba
Copy link
Member Author

zooba commented Mar 19, 2025

The two failing buildbots appear unrelated, and everything else passed. All I changed in the last commit was docs, so I'm not going to rerun the buildbots unless someone thinks it's necessary or worthwhile.

Copy link
Contributor

@sethmlarson sethmlarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SBOM changes LGTM!

"externalRefs": [
{
"referenceCategory": "SECURITY",
"referenceLocator": "cpe:2.3:a:zlib:zlib:1.3.1:*:*:*:*:*:*:*",
"referenceLocator": "cpe:2.3:a:zlib-ng:zlib-ng:2.2.4:*:*:*:*:*:*:*",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is likely the CPE that would get used, although we can't be for certain until a CVE exists with it... the only one I could find with some searching is cpe:...:zlib-ng:minizip-ng which is for a different component but at least the organization is a match.

@mdboom
Copy link
Contributor

mdboom commented Mar 19, 2025

I'm kicking off a pyperformance benchmarking run with this -- we have a non-zero usage of zlib in that suite.

@mdboom
Copy link
Contributor

mdboom commented Mar 19, 2025

The results for this on pyperformance are basically within the noise / no change. Not all that surprising, given that there are no benchmarks specifically aimed at zlib. https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20250319-3.14.0a6+-548daa7

@@ -2101,6 +2101,12 @@ zlib_exec(PyObject *mod)
PyUnicode_FromString(zlibVersion())) < 0) {
return -1;
}
#ifdef ZLIBNG_VERSION
if (PyModule_Add(mod, "ZLIBNG_VERSION",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense. annoying to have optional attributes that one would use hasattr or getattr on instead of blindly accessing, but realistically nobody should care as this is more internal informational so this is fine.

@zooba zooba enabled auto-merge (squash) March 19, 2025 18:38
@zooba zooba merged commit 63a638c into python:main Mar 19, 2025
42 checks passed
@zooba zooba deleted the gh-91349 branch March 19, 2025 19:25
@chris-eibl
Copy link
Member

This has broken the tailcall builds on Windows: https://github.com/python/cpython/actions/runs/13967899073/job/39102464260

D:\a\cpython\cpython\externals\\zlib-ng-2.2.4\\functable.c(79,26): error : use of undeclared identifier 'slide_hash_sse2'; did you mean 'slide_hash_c'? [D:\a\cpython\cpython\PCbuild\zlib-ng.vcxproj]
D:\a\cpython\cpython\externals\\zlib-ng-2.2.4\\functable.c(123,26): error : use of undeclared identifier 'slide_hash_avx2'; did you mean 'slide_hash_c'? [D:\a\cpython\cpython\PCbuild\zlib-ng.vcxproj]

@zooba
Copy link
Member Author

zooba commented Mar 20, 2025

Probably needs some fixed preprocessor checks (hopefully only in what's in the PC directory) to better handle the intrinsics being present/absent.

Though I notice that build warns NOTE: Visual Studio not detected. LLVM does not provide a C/C++ standard library and may be unable to locate MSVC headers., which might also be relevant.

@chris-eibl
Copy link
Member

I think, this is because here
https://github.com/zlib-ng/zlib-ng/blob/860e4cff7917d93f54f5d7f0bc1d0e8b1a3cb988/fallback_builtins.h#L4-L25
in line 4 __clang__ gets excluded and hence #define HAVE_BUILTIN_CTZ is not set (but in the MSVC case it is).

Then, in
https://github.com/zlib-ng/zlib-ng/blob/860e4cff7917d93f54f5d7f0bc1d0e8b1a3cb988/arch/x86/x86_functions.h#L13-L20
we dont get the prototype for slide_hash_sse2 in line 17, hence the error.

Sorry, I cannot get the permalinks to work here the usual way I am doing it :(

@chris-eibl
Copy link
Member

Though I notice that build warns NOTE: Visual Studio not detected. LLVM does not provide a C/C++ standard library and may be unable to locate MSVC headers., which might also be relevant.

FWIW, when building with the bundled clang-cl of VS I do not have these warnings, but get the same error.

@zooba
Copy link
Member Author

zooba commented Mar 20, 2025

Most likely zlib-ng would generate different versions of the header files I checked into the PC directory for Clang. Figuring out what differences belong in there and detecting it dynamically rather than using CMake is the way forward.

This should get a new issue (because clang-cl isn't a supported configuration, so we don't have to revert this change until it's working). Tag me on it and we can continue there.

<PrecompiledHeader>NotUsing</PrecompiledHeader>
<AdditionalIncludeDirectories>$(zlibNgDir);$(PySourceDir)PC;$(GeneratedZlibNgDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<PreprocessorDefinitions>%(PreprocessorDefinitions);ZLIB_COMPAT;WITH_GZFILEOP;NO_FSEEKO;HAVE_BUILTIN_ASSUME_ALIGNED;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;</PreprocessorDefinitions>
<PreprocessorDefinitions Condition="$(Platform) == 'Win32' or $(Platform) == 'x64'">%(PreprocessorDefinitions);X86_FEATURES;X86_HAVE_XSAVE_INTRIN;X86_SSE2;X86_SSSE3;X86_SSE42;X86_PCLMULQDQ_CRC;X86_AVX2;X86_AVX512;X86_AVX512VNNI;X86_VPCLMULQDQ_CRC</PreprocessorDefinitions>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that for all executables X86_AVX512 code will be emitted and used?
See e.g. adler32_avx512.c, where the whole file is guarded with X86_AVX512 and there is code like __m512i in it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's detected and called dynamically based on availability, but it's all compiled at compile time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I hope so, too.

<PreprocessorDefinitions>%(PreprocessorDefinitions);ZLIB_COMPAT;WITH_GZFILEOP;NO_FSEEKO;HAVE_BUILTIN_ASSUME_ALIGNED;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;</PreprocessorDefinitions>
<PreprocessorDefinitions Condition="$(Platform) == 'Win32' or $(Platform) == 'x64'">%(PreprocessorDefinitions);X86_FEATURES;X86_HAVE_XSAVE_INTRIN;X86_SSE2;X86_SSSE3;X86_SSE42;X86_PCLMULQDQ_CRC;X86_AVX2;X86_AVX512;X86_AVX512VNNI;X86_VPCLMULQDQ_CRC</PreprocessorDefinitions>
<PreprocessorDefinitions Condition="$(Configuration) == 'Debug'">%(PreprocessorDefinitions);ZLIB_DEBUG</PreprocessorDefinitions>
<EnableEnhancedInstructionSet Condition="$(Platform) == 'Win32' or $(Platform) == 'x64'">AdvancedVectorExtensions2</EnableEnhancedInstructionSet>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unconditionally letting the compiler generate code up to AVX2 for all *.c files here might result in the binary not running on all CPUs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants