x86,fs/resctrl: Remove inappropriate references to cacheinfo in the r…#90
Open
beckwen wants to merge 1 commit into openvelinux:6.6-velinux
…esctrl subsystem
commit 594902c986e269660302f09df9ec4bf1cf017b77 upstream.
In the resctrl subsystem's Sub-NUMA Cluster (SNC) mode, the rdt_mon_domain structure representing a NUMA node relies on the cacheinfo interface (rdt_mon_domain::ci) to store L3 cache information (e.g., shared_cpu_map) for monitoring. The L3 cache information of a SNC NUMA node determines which domains are summed for the "top level" L3-scoped events.
rdt_mon_domain::ci is initialized using the first online CPU of a NUMA node. When this CPU goes offline, its shared_cpu_map is cleared to contain only the offline CPU itself. Subsequently, attempting to read counters via smp_call_on_cpu(offline_cpu) fails (and error ignored), returning zero values for "top-level events" without any error indication.
Replace the cacheinfo references in struct rdt_mon_domain and struct rmid_read with the cacheinfo ID (a unique identifier for the L3 cache).
rdt_domain_hdr::cpu_mask contains the online CPUs associated with that domain. When reading "top-level events", select a CPU from rdt_domain_hdr::cpu_mask and utilize its L3 shared_cpu_map to determine valid CPUs for reading RMID counter via the MSR interface.
Considering all CPUs associated with the L3 cache improves the chances of picking a housekeeping CPU on which the counter reading work can be queued, avoiding an unnecessary IPI.
Intel-SIG: commit 594902c986e x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem backport to RDT driver for CWF
Fixes: 328ea68 ("x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files")
Signed-off-by: Qinyun Tan <qinyuntan@linux.alibaba.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/20250530182053.37502-2-qinyuntan@linux.alibaba.com
Signed-off-by: Kui Wen <kui.wen@intel.com>
6b87ae9 to 83bef8f
Please also backport commit d2e1b84c5141ff2ad465279acfc3cf943c960b78 ("fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters"), which is a fix for commit 594902c986e269660302f09df9ec4bf1cf017b77.
Please ensure that the Date and Author information of the patch is consistent with the upstream patch: Author: Qinyun Tan <qinyuntan@linux.alibaba.com> not:
Missing d2e1b84c5141ff2ad465279acfc3cf943c960b78 ("fs/resctrl: Eliminate false positive lockdep warning when reading SNC counters"). That commit is a bugfix for "x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem".
…esctrl subsystem
commit 594902c986e269660302f09df9ec4bf1cf017b77 upstream.
In the resctrl subsystem's Sub-NUMA Cluster (SNC) mode, the rdt_mon_domain structure representing a NUMA node relies on the cacheinfo interface (rdt_mon_domain::ci) to store L3 cache information (e.g., shared_cpu_map) for monitoring. The L3 cache information of a SNC NUMA node determines which domains are summed for the "top level" L3-scoped events.
rdt_mon_domain::ci is initialized using the first online CPU of a NUMA node. When this CPU goes offline, its shared_cpu_map is cleared to contain only the offline CPU itself. Subsequently, attempting to read counters via smp_call_on_cpu(offline_cpu) fails (and error ignored), returning zero values for "top-level events" without any error indication.
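The failure mode described above can be illustrated with a small user-space model. This is only a sketch, not kernel code: `toy_cacheinfo`, `toy_domain_old`, `toy_cpu_offline()` and the 8-CPU layout are hypothetical names and values chosen for illustration, and the way sibling maps are updated on offline is a modeling assumption.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8u              /* hypothetical toy system size */

typedef uint32_t cpumask_t;     /* bit n set => CPU n is in the mask */

/* Toy stand-in for per-CPU L3 cacheinfo; not the kernel's struct cacheinfo. */
struct toy_cacheinfo {
	unsigned int id;            /* L3 cache instance ID */
	cpumask_t shared_cpu_map;   /* CPUs sharing this L3, as seen by this CPU */
};

static struct toy_cacheinfo l3_info[NR_CPUS];

/* All toy CPUs share a single L3 with ID 0. */
static void toy_init(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		l3_info[cpu].id = 0;
		l3_info[cpu].shared_cpu_map = (1u << NR_CPUS) - 1;
	}
}

/*
 * Offline a CPU. Assumption for this model: the CPU is dropped from its
 * siblings' maps, and its own map shrinks to contain only itself.
 */
static void toy_cpu_offline(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	for (unsigned int i = 0; i < NR_CPUS; i++)
		l3_info[i].shared_cpu_map &= ~(1u << cpu);
	l3_info[cpu].shared_cpu_map = 1u << cpu;
}

/* Old scheme: the domain caches a pointer taken from its first online CPU. */
struct toy_domain_old {
	const struct toy_cacheinfo *ci;
};

/* CPUs the old scheme would consider when reading top-level events. */
static cpumask_t old_read_candidates(const struct toy_domain_old *d)
{
	return d->ci->shared_cpu_map;
}
```

In this model, a domain whose `ci` points at CPU 0's entry sees all eight CPUs before `toy_cpu_offline(0)` and only the offline CPU 0 afterwards, which mirrors the situation in which the counter read is attempted on an offline CPU.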
Replace the cacheinfo references in struct rdt_mon_domain and struct rmid_read with the cacheinfo ID (a unique identifier for the L3 cache).
rdt_domain_hdr::cpu_mask contains the online CPUs associated with that domain. When reading "top-level events", select a CPU from rdt_domain_hdr::cpu_mask and utilize its L3 shared_cpu_map to determine valid CPUs for reading RMID counter via the MSR interface.
Considering all CPUs associated with the L3 cache improves the chances of picking a housekeeping CPU on which the counter reading work can be queued, avoiding an unnecessary IPI.
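The approach described above can be sketched in user space as follows. This is a simplified model under stated assumptions rather than the actual patch: `toy_domain`, `toy_cacheinfo`, `read_candidates()` and the two-domain, 8-CPU layout are hypothetical, and housekeeping-CPU selection is not modeled.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8u              /* hypothetical toy system size */

typedef uint32_t cpumask_t;     /* bit n set => CPU n is in the mask */

/* Toy stand-in for per-CPU L3 cacheinfo; not the kernel's struct cacheinfo. */
struct toy_cacheinfo {
	unsigned int id;
	cpumask_t shared_cpu_map;
};

/* New scheme: the domain keeps only the L3 ID and its own online CPUs. */
struct toy_domain {
	unsigned int ci_id;
	cpumask_t cpu_mask;
};

static struct toy_cacheinfo l3_info[NR_CPUS];

/* Two SNC-style domains (CPUs 0-3 and 4-7) behind one L3 with ID 0. */
static void toy_init(struct toy_domain doms[2])
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++) {
		l3_info[cpu].id = 0;
		l3_info[cpu].shared_cpu_map = (1u << NR_CPUS) - 1;
	}
	doms[0] = (struct toy_domain){ .ci_id = 0, .cpu_mask = 0x0fu };
	doms[1] = (struct toy_domain){ .ci_id = 0, .cpu_mask = 0xf0u };
}

/* Offline a CPU: drop it from its domain and from its siblings' maps. */
static void toy_cpu_offline(struct toy_domain *d, unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	d->cpu_mask &= ~(1u << cpu);
	for (unsigned int i = 0; i < NR_CPUS; i++)
		l3_info[i].shared_cpu_map &= ~(1u << cpu);
	l3_info[cpu].shared_cpu_map = 1u << cpu;
}

/* Lowest set bit in the mask, or -1 if the mask is empty. */
static int first_cpu(cpumask_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return (int)cpu;
	return -1;
}

/*
 * Look up the L3 shared map at read time through a CPU that is still
 * online in the domain, instead of through a pointer cached at init.
 */
static cpumask_t read_candidates(const struct toy_domain *d)
{
	int cpu = first_cpu(d->cpu_mask);

	if (cpu < 0)
		return 0;                       /* no online CPU left in the domain */
	assert(l3_info[cpu].id == d->ci_id);    /* same L3 as recorded by ID */
	return l3_info[cpu].shared_cpu_map;
}
```

In this model, after `toy_cpu_offline(&doms[0], 0)` the lookup goes through CPU 1, so `read_candidates(&doms[0])` yields the seven CPUs that remain online rather than the offline CPU alone.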
Intel-SIG: commit 594902c986e
x86,fs/resctrl: Remove inappropriate references to cacheinfo in the resctrl subsystem backport to RDT driver for CWF
Test case: run the tool under kernel tools/testing/selftests/resctrl
./resctrl_tests
Fixes: 328ea68 ("x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files")
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/20250530182053.37502-2-qinyuntan@linux.alibaba.com