amdgpu_pm: reading from hwmon inputs causes GPU to resume from runpm suspend
Example trace:
[ 42.218074] PSP resume!
[ 42.218086] WARNING: CPU: 7 PID: 4903 at drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c:2747 psp_resume+0x2d/0x238 [amdgpu]
[ 42.218315] Call Trace:
[ 42.218317] <TASK>
[ 42.218319] amdgpu_device_fw_loading+0x74/0x140 [amdgpu]
[ 42.218368] ? pci_pm_restore_noirq+0xc0/0xc0
[ 42.218371] ? pci_pm_restore_noirq+0xc0/0xc0
[ 42.218372] amdgpu_device_resume+0x112/0x2c0 [amdgpu]
[ 42.218420] amdgpu_pmops_runtime_resume+0x80/0xf0 [amdgpu]
[ 42.218466] __rpm_callback+0x44/0x120
[ 42.218469] ? pci_pm_restore_noirq+0xc0/0xc0
[ 42.218470] rpm_callback+0x5d/0x70
[ 42.218471] rpm_resume+0x52c/0x7d0
[ 42.218473] __pm_runtime_resume+0x4a/0x80
[ 42.218475] amdgpu_hwmon_show_vddgfx+0x62/0x100 [amdgpu]
[ 42.218558] dev_attr_show+0x19/0x40
[ 42.218560] sysfs_kf_seq_show+0x99/0x100
[ 42.218562] seq_read_iter+0x120/0x4b0
[ 42.218564] ? selinux_file_permission+0x108/0x150
[ 42.218566] vfs_read+0x204/0x2d0
[ 42.218568] ksys_read+0x5f/0xe0
[ 42.218570] do_syscall_64+0x3b/0x90
[ 42.218571] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Querying the current vdd of a device ideally should not cause a resume. If the value is not available while the device is in runpm suspend, it would be better for the read to fail in some way.
The current approach means that power monitoring software which frequently queries power state will resume and suspend the GPU in a loop. You try to check your power usage and make it worse by doing so!
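Until the driver handles this itself, monitoring software could avoid the wakeup from userspace. The following is only a minimal sketch, not a tested fix. It assumes the hwmon directory has a `device` link to the GPU's PCI device, that the standard runtime PM file `power/runtime_status` is present there, and that `in0_input` is the vddgfx attribute. The helper name `read_if_awake` is hypothetical. The check is also racy, since the device can suspend between the status read and the sensor read.

```python
from pathlib import Path
from typing import Optional


def read_if_awake(hwmon_dir: str, attr: str = "in0_input") -> Optional[str]:
    """Return the sensor value, or None if the device is not runtime-active.

    hwmon_dir is assumed to look like /sys/class/hwmon/hwmonN (hypothetical
    example path); attr is assumed to be a hwmon input such as in0_input.
    """
    base = Path(hwmon_dir)
    # Runtime PM status of the parent device. Reading this file is served
    # by the PM core and does not resume the device.
    status_file = base / "device" / "power" / "runtime_status"
    try:
        status = status_file.read_text().strip()
    except OSError:
        return None  # status unknown: stay on the safe side, skip the read
    if status != "active":
        return None  # suspended, suspending, resuming, ...: leave it alone
    return (base / attr).read_text().strip()
```

A monitoring loop would call `read_if_awake()` for each poll and report the sensor as unavailable when it returns `None`, rather than forcing a resume.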
The current code has a number of checks for in_suspend, however they deliberately exclude runpm. These checks were introduced in d2ae842d24625756fb7ac5440335ed2973463b7d ("drm/amdgpu/pm: bail on sysfs/debugfs queries during platform suspend").
I can trigger this reliably by running sensors from lm_sensors or by manually reading the hwmon inputs for a card.
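For anyone who wants to reproduce the manual read, here is a rough sketch. It assumes the card's hwmon directory exposes its sensors as `*_input` files and that `device/power/runtime_status` is available. The function names are hypothetical and the hwmon index differs per system.

```python
from pathlib import Path


def runtime_status(hwmon_dir: str) -> str:
    """Read the runtime PM status of the hwmon parent device."""
    path = Path(hwmon_dir) / "device" / "power" / "runtime_status"
    try:
        return path.read_text().strip()
    except OSError:
        return "unknown"


def read_inputs(hwmon_dir: str) -> dict:
    """Read every *_input attribute, as a manual reproduction would."""
    values = {}
    for f in sorted(Path(hwmon_dir).glob("*_input")):
        try:
            values[f.name] = f.read_text().strip()
        except OSError as exc:
            # Some attributes may legitimately fail; record it, keep going.
            values[f.name] = f"error: {exc}"
    return values


def reproduce(hwmon_dir: str):
    """Return (status before, sensor values, status after) for one pass."""
    before = runtime_status(hwmon_dir)
    values = read_inputs(hwmon_dir)
    after = runtime_status(hwmon_dir)
    return before, values, after
```

Calling `reproduce()` on the card's hwmon directory while the GPU is runtime suspended, then checking dmesg for the "PSP resume!" warning, should show whether the read woke the device. The status after the reads may already have changed again if autosuspend kicked in.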