(bisected) GPU crash when launching games since kernel 6.5-rc1
Brief summary of the problem:
Since Linux kernel 6.5-rc1, the GPU crashes when launching games.
The crash happens anywhere from a couple of seconds to just under a minute after the game is started.
Kernel version 6.4.0 was OK. I bisected the issue to this commit:
commit 50a7c8765ca69543ffdbf855de0fd69aea769ccf
Author: Alex Deucher <alexander.deucher@amd.com>
Date: Fri Jun 16 17:07:53 2023 -0400
drm/amdgpu: enable mcbp by default on gfx9
It's required for high priority queues.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2535
Reviewed-and-tested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reverting this commit on top of 6.5-rc2 fixes the issue.
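The revert test can be sketched as follows. This is a minimal sketch, assuming a mainline kernel git checkout that has the v6.5-rc2 tag; the function name and its path argument are placeholders for illustration, and building and installing the kernel afterwards is left out.

```shell
# Commit identified by the bisect (taken from the report above).
BAD_COMMIT=50a7c8765ca69543ffdbf855de0fd69aea769ccf

# Hypothetical helper: revert the bisected commit on top of v6.5-rc2.
# $1 is the path to a Linux kernel git checkout (assumed to contain the tag).
revert_mcbp_default() {
    # Check out the release candidate on a detached HEAD, then revert
    # the commit without opening an editor for the commit message.
    git -C "$1" checkout --detach v6.5-rc2 &&
    git -C "$1" revert --no-edit "$BAD_COMMIT"
}
```

Usage would be `revert_mcbp_default /path/to/linux`, followed by the usual kernel build and a reboot into the resulting kernel.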
Hardware description:
- CPU:
AMD Ryzen 7 5700G with Radeon Graphics
- GPU:
05:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] [1002:1638] (rev c8)
- System Memory: 32 GB
- Display(s): Dual screens
- Type of Display Connection: One display using DisplayPort, the other using HDMI.
System information:
- Distro name and Version: Debian Sid
- Kernel version: 6.5-rc2
- Custom kernel: Custom config config-6.5.0-rc2
- AMD official driver version: N/A
How to reproduce the issue:
I see the issue in at least Shadow of the Tomb Raider and House Flipper. I did not test any other games.
To reproduce:
- Start the game
- Wait up to one minute
- The GPU crashes
Attached files:
Log files (for system lockups / game freezes / crashes)
Kernel log showing crash in 6.5-rc2 with game Shadow of the Tomb Raider: