Skip to content

GH-131521: fix clangcl build on Windows for zlib-ng #131526

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Mar 24, 2025
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
53 changes: 40 additions & 13 deletions PCbuild/zlib-ng.vcxproj
Original file line number Diff line number Diff line change
Expand Up @@ -90,14 +90,15 @@
<ItemDefinitionGroup>
<ClCompile>
<AdditionalOptions>%(AdditionalOptions) /utf-8 /w34242</AdditionalOptions>
<AdditionalOptions Condition="$(SupportPGO) and $(Configuration) == 'PGUpdate' and $(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -fno-profile-instr-use</AdditionalOptions>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since zlib-ng is compiled into a static lib, it won't produce any profile data. It is the first target that is built (after the _freeze_module)

<ItemGroup>
<!-- Static libraries for use later in the build -->
<Projects Include="zlib-ng.vcxproj" Condition="$(zlibNgDir) != '' and Exists('$(zlibNgDir)\zlib-ng.h.in')" />

But since there is no \$(TargetName)_*.profraw for it
<ItemGroup>
<_profrawFiles Include="$(OutDir)instrumented\$(TargetName)_*.profraw" />
</ItemGroup>

MergeClangProfileData won't create profdata.profdata:
<Target Name="MergeClangProfileData" BeforeTargets="PrepareForBuild"
Condition="'$(SupportPGO)' and $(Configuration) == 'PGUpdate'"
Inputs="@(_profrawFiles)"

IMHO, this is the best fix (instead of just setting SupportPGO=false for it):

  • it still will get compiled with -fprofile-instr-generate
  • it will be linked with python314.dll and (hopefully?) happily contribute its part to python314.profraw
  • MSVC doesn't create zlib-ng!1.pgc and zlib-ng.pgd either. No idea whether it gets PGOed there, but the same idea applies? The difference is just, that MSVC does not produce a hard error if the profile data isn't there, whereas clang does (TRACKEDEXEC : error : Error in reading profile profdata.profdata: no such file or directory), and I still have found no way to make that a warning :(

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since zlib-ng is compiled into a static lib, it won't produce any profile data.

Thinking about it again, instead of doing the -fno-profile-instr-use trick in the PGUpdate phase, it would be better, to just create the lib once instead of twice (like I've recently done for the _freeze_module).
There is no need to build the lib twice (same holds true for the MSVC case)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a quick try:

  • introducing a StaticLibProjects like FreezeProjects in pcbuild.proj is simple and straight forward.
  • Then I'd have to make the ProjectReference Include="zlib-ng.vcxproj conditional (otherwise it would still be built in the PGUpdate phase) - ok, doable.
  • but then zlib.h is no longer found, because it is generated into $(IntDir)zlib.h - likewise $(IntDir)zlib-ng.h. So stopping here :)

So turns out to be not so easy and I'd like to keep that for a follow-up PR (if at all).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed the call counts for PGO are going straight into each project that references it, since a static lib is essentially just a bundle of .obj files and so the link-time optimisation happens in the referencing project, not the static lib project.

Which means yes, it's participating in PGO, and it doesn't require rebuilding. Though it's not going to hurt that badly, so I'd tend towards decoupled build configuration (it would be easy to fix by moving the header file regeneration into pythoncore.vcxproj, but that's bad coupling).

Possibly we could make SupportsPGO choose the definition of IntDir and then it would usually be a quick rebuild?

Copy link
Member Author

@chris-eibl chris-eibl Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but let's do that in a follow-up PR, since it is not really related to clang-cl.

Technically, the /arch changes so that the resulting binary can run on legacy CPUs aren't either, but IMHO they are a nice fit here.

It would just put some rebase burden onto this PR, if we'd do it in a separate one.

<DisableSpecificWarnings>4206;4054;4324</DisableSpecificWarnings>
<LanguageStandard_C>stdc11</LanguageStandard_C>
<PrecompiledHeader>NotUsing</PrecompiledHeader>
<AdditionalIncludeDirectories>$(zlibNgDir);$(PySourceDir)PC;$(GeneratedZlibNgDir);%(AdditionalIncludeDirectories)</AdditionalIncludeDirectories>
<PreprocessorDefinitions>%(PreprocessorDefinitions);ZLIB_COMPAT;WITH_GZFILEOP;NO_FSEEKO;HAVE_BUILTIN_ASSUME_ALIGNED;_CRT_SECURE_NO_DEPRECATE;_CRT_NONSTDC_NO_DEPRECATE;</PreprocessorDefinitions>
<PreprocessorDefinitions Condition="$(Platform) == 'Win32' or $(Platform) == 'x64'">%(PreprocessorDefinitions);X86_FEATURES;X86_HAVE_XSAVE_INTRIN;X86_SSE2;X86_SSSE3;X86_SSE42;X86_PCLMULQDQ_CRC;X86_AVX2;X86_AVX512;X86_AVX512VNNI;X86_VPCLMULQDQ_CRC</PreprocessorDefinitions>
<PreprocessorDefinitions Condition="$(Configuration) == 'Debug'">%(PreprocessorDefinitions);ZLIB_DEBUG</PreprocessorDefinitions>
<EnableEnhancedInstructionSet Condition="$(Platform) == 'Win32' or $(Platform) == 'x64'">AdvancedVectorExtensions2</EnableEnhancedInstructionSet>
<PreprocessorDefinitions Condition="$(PlatformToolset) == 'ClangCL'">%(PreprocessorDefinitions);HAVE_BUILTIN_CTZ</PreprocessorDefinitions>
</ClCompile>
</ItemDefinitionGroup>
<ItemGroup>
Expand Down Expand Up @@ -141,18 +142,44 @@
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_sse2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\compare256_sse2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\slide_hash_sse2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_ssse3.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_ssse3.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_sse42.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\crc32_pclmulqdq.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\slide_hash_avx2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_avx2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\compare256_avx2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_avx2.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_avx512.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_avx512.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_avx512_vnni.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\crc32_vpclmulqdq.c" />
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_ssse3.c">
<AdditionalOptions Condition="$(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -mssse3</AdditionalOptions>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_ssse3.c">
<AdditionalOptions Condition="$(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -mssse3</AdditionalOptions>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_sse42.c">
Copy link
Member Author

@chris-eibl chris-eibl Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting: the filename suggests sse43 sse42, but -mssse3 deemed sufficient.
Ups: edited typo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth matching the filename, so that if an updated zlib-ng uses SSE4.3 instructions we don't need to modify anything here.

The file relies on intrinsics, so it's not going to automatically generate newer instructions.

<AdditionalOptions Condition="$(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -msse4.2</AdditionalOptions>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\crc32_pclmulqdq.c">
<AdditionalOptions Condition="$(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -mssse3 -mpclmul</AdditionalOptions>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\slide_hash_avx2.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions2</EnableEnhancedInstructionSet>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_avx2.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions2</EnableEnhancedInstructionSet>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\compare256_avx2.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions2</EnableEnhancedInstructionSet>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_avx2.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions2</EnableEnhancedInstructionSet>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_avx512.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions512</EnableEnhancedInstructionSet>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\chunkset_avx512.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions512</EnableEnhancedInstructionSet>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\adler32_avx512_vnni.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions512</EnableEnhancedInstructionSet>
<AdditionalOptions Condition="$(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -mavx512vnni</AdditionalOptions>
</ClCompile>
<ClCompile Include="$(zlibNgDir)\arch\x86\crc32_vpclmulqdq.c">
<EnableEnhancedInstructionSet>AdvancedVectorExtensions512</EnableEnhancedInstructionSet>
<AdditionalOptions Condition="$(PlatformToolset) == 'ClangCL'">%(AdditionalOptions) -mvpclmulqdq</AdditionalOptions>
</ClCompile>
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\PC\zconf.h" />
Expand Down
Loading