Update hwloc CPU Binding Implementation #1226
Draft
bcmIntc wants to merge 3 commits into Sandia-OpenSHMEM:main from
- Fix hwloc_bitmap_copy called with hwloc_obj_t instead of hwloc_bitmap_t (covering_obj -> covering_obj->cpuset) in both NUMA and socket paths
- Add return value check for hwloc_get_proc_cpubind with goto hwloc_cleanup on failure to avoid operating on uninitialized bitmap data
- Remove unused variables numa_node_id and socket_id
- Remove the #if 0 blocks used to maintain the original code
a1feea1 to 4fe8faa
- Only call hwloc_set_proc_cpubind() when the process affinity spans multiple NUMA nodes. If the binding already fits within a single NUMA node (e.g. set by the job launcher), leave it untouched to avoid inadvertently broadening a tight affinity mask.
- Add SHMEM_DISABLE_CPU_BINDING env var to skip all hwloc CPU binding during shmem_init(), giving users full control over process affinity.
- Remove leftover debug print_bitmap_hex() function and printf calls.
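The decision rule described in this commit can be modeled without hwloc, as a rough sketch. Everything here is an assumption made for illustration: CPU sets are plain 64-bit masks rather than hwloc bitmaps, the helper names cpu_binding_disabled and choose_binding are hypothetical, the env var parsing rule (any non-empty value other than "0" disables binding) is a guess, and leaving the affinity untouched when it spans several sockets is also a guess, since the commit message does not cover that case.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parsing rule: binding is disabled when the variable is set
 * to any non-empty value other than "0". */
static int cpu_binding_disabled(void)
{
    const char *v = getenv("SHMEM_DISABLE_CPU_BINDING");
    return v != NULL && v[0] != '\0' && !(v[0] == '0' && v[1] == '\0');
}

/* True when every CPU in `inner` is also in `outer`. */
static int mask_within(uint64_t inner, uint64_t outer)
{
    return (inner & ~outer) == 0;
}

/* Toy model of the rule: returns 1 and writes the new mask to *target when
 * the process should be rebound, 0 when the affinity is left untouched. */
static int choose_binding(int disabled, uint64_t affinity,
                          const uint64_t *numa, size_t n_numa,
                          const uint64_t *sockets, size_t n_sockets,
                          uint64_t *target)
{
    size_t i;

    if (disabled)
        return 0; /* user asked to skip all binding */

    /* Affinity already fits within one NUMA node: do not touch it. */
    for (i = 0; i < n_numa; i++)
        if (mask_within(affinity, numa[i]))
            return 0;

    /* Affinity spans several NUMA nodes: bind to the covering socket. */
    for (i = 0; i < n_sockets; i++) {
        if (mask_within(affinity, sockets[i])) {
            *target = sockets[i];
            return 1;
        }
    }

    return 0; /* assumption: no single covering socket, leave as-is */
}
```

In this model a caller would pass cpu_binding_disabled() as the first argument, so the environment lookup stays separate from the pure decision function.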
Description:
This PR replaces the previous compile-time-gated hwloc CPU binding in shmem_internal_heap_postinit() with an always-on implementation (only when hwloc is enabled) that respects user affinity.
New implementation
Replaced the HWLOC_ENFORCE_SINGLE_SOCKET / HWLOC_ENFORCE_SINGLE_NUMA_NODE compile-time guards with unconditional binding logic (when built with USE_HWLOC). The new code queries the PE's current CPU affinity, attempts to find a covering NUMA node first, and falls back to the covering socket if the affinity spans multiple NUMA nodes.
Bug fixes
- Fixed hwloc_bitmap_copy being called with hwloc_obj_t instead of hwloc_obj_t->cpuset in both the NUMA and socket paths, which silently corrupted the destination bitmap and caused hwloc_set_proc_cpubind to set an invalid binding.
- Added a return value check for hwloc_get_proc_cpubind with goto hwloc_cleanup on failure, preventing subsequent hwloc operations from running on an uninitialized bitmap.
- Removed the #if 0 blocks used to maintain the original code.
Preserve user affinity
- hwloc_set_proc_cpubind() is only called when the affinity spans multiple NUMA nodes and needs narrowing to the covering socket.
- Added the SHMEM_DISABLE_CPU_BINDING environment variable to skip all hwloc CPU binding at init time.
Test coverage