Running vkoverhead results in device_lost/gpu hang on dg2
System information
- OS: Arch
- GPU: a750
- Kernel version: 6.0.2-arch1-1 (standard Arch kernel) and 6.0.0-rc2-1drm-intel-next (drm-intel-next kernel: https://cgit.freedesktop.org/drm/drm-intel/log/?h=drm-intel-next)
- Mesa version: 22.2.1
Describe the issue
Running just the descriptor tests on either kernel usually hangs the GPU and leaves the sway session permanently stuck, though the drm-intel-next kernel lasts almost to the end of the suite.
If you attempt to run the full vkoverhead test suite, vulkan_device_lost is consistently reported during the 38th test (submit_1cmdbuf). Running only the submit tests, without the draw tests, completes as expected instead of reporting device_lost.
Regression
I don't know if it worked on earlier versions.
Log files as attachment
- Output of dmesg
- There was no i915 or drm output in dmesg for the vulkan_device_lost, so I assume it is Mesa related and not kernel related. The GPU hangs on both kernels look similar, so I'll only include the standard kernel unless there is interest in the other one.
- GPU hang details
- GPU does not hang