unitsync: link generic cpu_topology instead of per-platform detection #3049

Closed
tomjn wants to merge 1 commit into beyond-all-reason:master from tomjn:decouple-unitsync-cpu-topology

@tomjn tomjn commented Jun 20, 2026

Contributor

What

unitsync links a generic, platform-agnostic cpu_topology implementation instead of the per-platform Platform/{Linux,Win,Mac}/CpuTopology.cpp, removing unitsync's dependency on real CPU topology detection on all platforms.

This is an alternative to #3042's approach of adding Mac/CpuTopology.cpp to unitsync: it makes unitsync stop needing per-platform topology at all, which is what @sprunk asked for on that PR.

Why

unitsync wraps FileSystemInitializer::Initialize() in the thread pool so that archive scanning/hashing parallelizes via for_mt; it therefore genuinely needs the pool. It does not pin threads, so it does not need real CPU topology (P/E core masks, cache groups, pin policy).

The dependency was incidental: CPUID::EnumerateCores() (called from the CPUID singleton ctor) eagerly calls cpu_topology::GetProcessorMasks() and cpu_topology::GetProcessorCache(), and Threading::GetChosenThreadPinPolicy() calls cpu_topology::GetThreadPinPolicy(). Those are link-time references in TUs that unitsync compiles, so it had to link a per-platform CpuTopology.cpp even though it never pins.

Approach

A generic CpuTopologyGeneric.cpp reports every logical core as a single group of performance cores (so CPUID::EnumerateCores() still derives a correct logical-core count for the pool) and requests THREAD_PIN_POLICY_NONE. unitsync links that instead of the per-platform file. The change is contained to tools/unitsync/CMakeLists.txt + the new file; no shared engine CMake or threading logic is touched, so the engine/dedicated server keep linking the real per-platform topology for sim-worker pinning.
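As a rough illustration of the idea above, here is a minimal self-contained sketch of what a generic topology provider could look like. It is not the contents of CpuTopologyGeneric.cpp: the namespace `cpu_topology_sketch`, the `ProcessorMasks` fields, the 64-bit mask width, the second enum value, and the helpers `LowBits` and `CountLogicalCores` are all assumptions made for illustration. Only the function names `GetProcessorMasks` and `GetThreadPinPolicy` and the constant `THREAD_PIN_POLICY_NONE` come from the description. Cache groups are omitted.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-ins for the engine's cpu_topology declarations;
// the real header's types and field names may differ.
namespace cpu_topology_sketch {

enum ThreadPinPolicy {
	THREAD_PIN_POLICY_NONE,          // name taken from the PR description
	THREAD_PIN_POLICY_PER_PERF_CORE, // hypothetical value, unused here
};

struct ProcessorMasks {
	std::uint64_t performanceCoreMask = 0; // assumed field name and width
	std::uint64_t efficiencyCoreMask = 0;  // assumed field name and width
};

// Build a mask with the lowest `count` bits set, clamped to 64 bits.
inline std::uint64_t LowBits(unsigned count) {
	assert(count > 0);
	return (count >= 64) ? ~std::uint64_t{0} : ((std::uint64_t{1} << count) - 1);
}

// Report every logical core as one group of performance cores.
inline ProcessorMasks GetProcessorMasks() {
	unsigned n = std::thread::hardware_concurrency();
	if (n == 0)
		n = 1; // hardware_concurrency() may return 0 when the count is unknown

	ProcessorMasks masks;
	masks.performanceCoreMask = LowBits(n);
	return masks;
}

// A consumer that never pins threads can always request no pinning.
inline ThreadPinPolicy GetThreadPinPolicy() {
	return THREAD_PIN_POLICY_NONE;
}

// How a caller might derive the logical-core count from the masks.
inline unsigned CountLogicalCores(const ProcessorMasks& m) {
	std::uint64_t all = m.performanceCoreMask | m.efficiencyCoreMask;
	unsigned count = 0;
	for (; all != 0; all &= all - 1) // clear the lowest set bit each step
		++count;
	return count;
}

} // namespace cpu_topology_sketch
```

With this shape, a pool that sizes itself from `CountLogicalCores(GetProcessorMasks())` gets one worker slot per logical core (up to the mask width), without any vendor or cache detection code being linked.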

Note on the "make CPUID lazy" alternative

An earlier idea was to make CPUID::EnumerateCores() compute topology lazily so the core-count path wouldn't pull in cpu_topology. That does not work for unitsync: it is a SHARED library, and on macOS/Windows the build does not set -fvisibility=hidden (that flag is gated to Linux in TestCXXFlags.cmake), so the Threading/CpuTopologyCommon free functions that reference the symbols stay exported and can't be dead-stripped. Deferring when a call runs doesn't remove a link-time reference, so a generic implementation is the approach that actually drops the per-platform file.
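The link-time point can be shown with a small sketch. All names here (`topo_sketch`, `CoreCount`, `GetProcessorMask`, `g_topologyCalls`) are hypothetical and not taken from the engine. The lazy path never calls the topology function in the common case, yet the translation unit still names the symbol, so some definition has to be supplied at link time. A trivial one is enough.

```cpp
namespace topo_sketch {

// Counts how often the topology function actually runs.
int g_topologyCalls = 0;

// Declaration only: any function below that names this creates a symbol
// reference which the linker must resolve, whether or not the call runs.
unsigned GetProcessorMask();

// "Lazy" path: topology is queried only when explicitly requested.
// Because this function is externally visible, the branch cannot be
// assumed dead, so the reference to GetProcessorMask() remains.
unsigned CoreCount(bool wantTopology) {
	if (wantTopology)
		return GetProcessorMask();
	return 1;
}

// Removing this definition makes the link fail with an undefined symbol,
// even for programs that only ever call CoreCount(false).
unsigned GetProcessorMask() {
	++g_topologyCalls;
	return 0x1;
}

} // namespace topo_sketch
```

Calling `CoreCount(false)` leaves `g_topologyCalls` at zero, which is the runtime effect laziness buys. The definition is still required to link, which is the part a generic implementation satisfies.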

This satisfies the dependency with a trivial stub rather than eliminating the symbol references entirely; a literal "drop" would require #ifdef UNITSYNC stubs through CpuID.cpp/Threading.cpp/CpuTopologyCommon.cpp (shared core files). Happy to go that route instead if preferred.

Verification

Built unitsync on macOS (Homebrew GCC 16):

  • Links cleanly.
  • Link line contains CpuTopologyGeneric.cpp.o, not Mac/CpuTopology.cpp.o.
  • nm on libunitsync.dylib: zero undefined cpu_topology symbols; none of the real platform internals (detect_cpu_vendor, collect_intel, get_thread_siblings) are linked.

Engine targets are untouched in their source wiring, so they keep the real topology.

Refs #3042.

Assisted by Claude Code; verified by building on macOS.

unitsync wraps FileSystemInitializer::Initialize() in the thread pool so
that archive scanning/hashing parallelizes via for_mt; it therefore
genuinely needs the pool. It does not, however, pin threads, so it does not
need real CPU topology detection (P/E core masks, cache groups, pin policy).

The topology dependency was incidental: CPUID::EnumerateCores() (called
from the CPUID singleton ctor) eagerly calls cpu_topology::GetProcessorMasks()
and GetProcessorCache(), and Threading::GetChosenThreadPinPolicy() calls
cpu_topology::GetThreadPinPolicy(). These are link-time references in TUs
unitsync compiles, so it had to link Platform/{Linux,Win,Mac}/CpuTopology.cpp
even though it never pins.

Provide a generic, platform-agnostic cpu_topology implementation
(CpuTopologyGeneric.cpp) that reports every logical core as a single group
of performance cores and requests no pinning, and link that into unitsync
instead of the per-platform file. This removes the per-platform CpuTopology
from unitsync on all platforms (notably Mac/CpuTopology.cpp), making the
build consistent across Linux/Windows/macOS.

The engine and dedicated server are untouched and keep linking the real
per-platform topology for sim-worker pinning.

@sprunk sprunk left a comment

Collaborator

Turns out the unitsync use of threading is reasonable so I think the original approach in that other PR (to actually add the proper mac cpu topology) is better after all.

@sprunk sprunk added the big mac Part of the big push to support Mac. label Jun 22, 2026
@tomjn tomjn closed this Jun 25, 2026
tomjn commented Jun 25, 2026

Contributor Author

Closed in favor of #3042
