Power draw and temperature information is not available for Arc GPUs
I have an Intel Arc A770 16GB Limited Edition. I've been using it since the beginning of the year, and as far as I can tell, power draw (in watts) and temperature information are not exposed anywhere, even though this is basic information needed for proper monitoring. I verified that this is still the case with the 6.6.0-rc5DRM-TIP-g186ff606ce50 drm-tip kernel.
lm_sensors returns
i915-pci-0300
Adapter: PCI adapter
in0: 0.00 V
power1: N/A (max = 190.00 W)
energy1: 14.95 kJ
I assume that power1 is supposed to show the power draw in watts, but it has always shown N/A no matter what load I put on the card. Even assuming that this is an issue with lm_sensors rather than the driver itself, I can't find a file under /sys/class/drm/card1/device/hwmon/hwmon2 (the A770) that exposes real-time power draw.
Now, I've written a bash script to compute the power draw (in watts) from the energy consumption data (in joules) exposed by /sys/class/drm/card1/device/hwmon/hwmon2/energy1_input, and so have others online, each of us with our own script, but I think it makes more sense for power draw to be exposed directly so that it is available to lm_sensors and similar tools.
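For reference, the workaround looks roughly like the sketch below. This is not my exact script, just a minimal illustration: the helper names avg_watts and sample_power are made up for this example, and it assumes energy1_input is a cumulative counter in microjoules, as the hwmon sysfs ABI describes. It also treats the sleep interval as the elapsed time, so the result is only approximate.

```shell
# Sketch: derive average power (W) from the cumulative hwmon energy counter.
# Assumption: energy1_input reports microjoules (hwmon sysfs ABI).

# Hypothetical helper: average watts between two counter samples.
# Args: energy_start_uJ energy_end_uJ elapsed_seconds
avg_watts() {
    awk -v e0="$1" -v e1="$2" -v dt="$3" \
        'BEGIN { printf "%.2f\n", (e1 - e0) / 1e6 / dt }'
}

# Hypothetical helper: read the counter twice and print the average power.
# Args: path to energy1_input, interval in seconds (default 1)
sample_power() {
    file="$1"
    interval="${2:-1}"
    e0=$(cat "$file")
    sleep "$interval"
    e1=$(cat "$file")
    # The sleep interval stands in for the true elapsed time (approximation).
    avg_watts "$e0" "$e1" "$interval"
}

# Usage (path from this report; the card and hwmon indices may differ elsewhere):
# sample_power /sys/class/drm/card1/device/hwmon/hwmon2/energy1_input 1
```

Having every user reimplement this, with different sampling intervals and rounding, is exactly why a directly exposed power reading would be preferable.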
More importantly, there is no way to get the current temperature of the card. The only temperature-related sysfs files I've seen are /sys/class/drm/card1/gt/gt0/throttle_reason_vr_thermalert, /sys/class/drm/card1/gt/gt0/throttle_reason_thermal, and /sys/class/drm/card1/gt/gt0/throttle_reason_vr_tdc.
Knowing that the card got hot enough to be throttled is better than not knowing, but that's not enough. We need proper temperature information for the card, such as its current temperature and probably the temperature at which it starts throttling.
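To illustrate how little can be monitored today, the best one can do is poll those throttle flags. A minimal sketch follows, assuming each file reads as a single 0 or 1 value; show_throttle_reasons is a hypothetical helper name used only for this example.

```shell
# Sketch: print the throttle flags found under a gt directory.
# Assumption: each file holds a single value (0 = not throttled, 1 = throttled).

# Hypothetical helper. Args: path to the gt directory.
show_throttle_reasons() {
    gt_dir="$1"
    for name in throttle_reason_thermal \
                throttle_reason_vr_thermalert \
                throttle_reason_vr_tdc; do
        if [ -r "$gt_dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$gt_dir/$name")"
        else
            # File missing or unreadable on this system.
            printf '%s=unavailable\n' "$name"
        fi
    done
}

# Usage (path from this report; the card index may differ elsewhere):
# show_throttle_reasons /sys/class/drm/card1/gt/gt0
```

This only tells you that throttling is happening, not how close the card is to it.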
Is it possible to make power draw and temperature information available for Arc cards? Thank you!