Open
Description
After installing an agama build of the driver, I found that clinfo -l lists the GPU device but xpu-smi discovery cannot find it.
The root cause is that the name of the hwmon energy file changed.
# Only exists:
/sys/class/hwmon/hwmon3/energy2_input
# Does NOT exist (xpu-smi looks for this):
/sys/class/hwmon/hwmon3/energy1_input
As a result, the value xpu-smi reads is empty. The change below fixes the issue for me. I am not sure it is the right fix, but I am posting it here for reference:
cd ~/xpumanager && git diff V1.3.6 -- core/src/device/gpu/gpu_device_stub.cpp
diff --git a/core/src/device/gpu/gpu_device_stub.cpp b/core/src/device/gpu/gpu_device_stub.cpp
index 258ebab..f5e5334 100644
--- a/core/src/device/gpu/gpu_device_stub.cpp
+++ b/core/src/device/gpu/gpu_device_stub.cpp
@@ -224,8 +224,14 @@ std::shared_ptr<MeasurementData> GPUDeviceStub::loadPVCIdlePowers(std::string bd
std::string name = getFileValue("/sys/class/hwmon/" + std::string(pdirent->d_name) +"/name");
name.erase(0, name.find_first_not_of(" \n\r\t"));
name.erase(name.find_last_not_of(" \n\r\t") + 1);
- auto energy_path = "/sys/class/hwmon/" + std::string(pdirent->d_name) +"/energy1_input";
- uint64_t value = std::stoull(getFileValue(energy_path));
+ // xe driver (kernel >= 6.8) uses energy2_input; i915 uses energy1_input
+ std::string energy_path = "/sys/class/hwmon/" + std::string(pdirent->d_name) + "/energy1_input";
+ if (access(energy_path.c_str(), F_OK) != 0)
+ energy_path = "/sys/class/hwmon/" + std::string(pdirent->d_name) + "/energy2_input";
+ std::string energy_str = getFileValue(energy_path);
+ if (energy_str.empty())
+ continue;
+ uint64_t value = std::stoull(energy_str);
auto timestamp = Utility::getCurrentMillisecond();
XPUM_LOG_TRACE("[{}] path:{}, value: {}, timestamp: {}", gpu_bdf, energy_path, value, timestamp);
if (pvc_idle_powers.count(gpu_bdf) == 0)
Environment
Hardware: Intel Data Center GPU Max 1550 (8086:0BD5)
OS: Ubuntu 24.04
Kernel: Linux 984fee015d7d.jf.intel.com 6.18.0-rc2+prerelease3000+ #1 SMP PREEMPT_DYNAMIC Sun Oct 26 04:57:21 PDT 2025 x86_64 x86_64 x86_64 GNU/Linux
Level Zero: 1.28.0.0
xpu-smi version: 1.3.6
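For discussion, here is a minimal standalone sketch of a slightly more general variant of the fallback in the patch above: instead of hard-coding energy2_input as the only alternative, it probes energyN_input names in order and reads the counter without throwing on an empty file. The helper names findEnergyInput and readCounter and the maxIndex limit are hypothetical and not part of xpumanager; the sketch assumes C++17 and that the first existing energyN_input file is the one to use, which may not hold on every driver.

```cpp
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of xpumanager): return the first existing
// energyN_input file in a hwmon directory, trying N = 1..maxIndex in order.
// maxIndex = 4 is an arbitrary assumption for this sketch.
std::optional<fs::path> findEnergyInput(const fs::path& hwmonDir, int maxIndex = 4) {
    for (int i = 1; i <= maxIndex; ++i) {
        fs::path candidate = hwmonDir / ("energy" + std::to_string(i) + "_input");
        std::error_code ec;  // use the non-throwing overload of exists()
        if (fs::exists(candidate, ec)) {
            return candidate;
        }
    }
    return std::nullopt;
}

// Hypothetical helper: read an unsigned counter from a file. Returns
// std::nullopt instead of throwing when the file is missing, empty, or
// does not start with a number.
std::optional<std::uint64_t> readCounter(const fs::path& file) {
    std::ifstream in(file);
    std::uint64_t value = 0;
    if (in >> value) {
        return value;
    }
    return std::nullopt;
}
```

In loadPVCIdlePowers, a caller could pass "/sys/class/hwmon/" plus the directory entry name to findEnergyInput and skip the entry (continue) when either helper returns std::nullopt, which mirrors the empty-string check in the diff.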