AMD 6900XT amdgpu_dm_atomic_commit_tail Waiting for fences timed out!
Description
Initially filed here:
Please see there for dmesg and lspci output and more details.
Affects:
AMD 6700 XT, 5700 XT
Freezes:
- Ubuntu 21.10 GNOME, with Mesa 21.2.2-1ubuntu1 and kernels 5.13 and 5.15.
Freezes just using the desktop / file manager / a web browser.
- Fedora 35 KDE, with Mesa 21.2.3 and 21.2.5 and kernels 5.14.20 and 5.15.*
On Fedora, I've built kernels 5.14, 5.13, 5.12 from source. No difference.
Mostly freezes while using Android Studio and/or GoLand (both are IDEs by JetBrains).
- Manjaro Linux GNOME, kernel 5.15.2, initially the distro's Mesa 21.2.5.
On Manjaro, I've built Mesa from source hoping to bisect: I uninstalled the distro's Mesa and related libraries and installed my build to /usr (not /usr/local).
Tried versions 21.0.1, 21.0.3, 21.3.1 - they all freeze.
Tried this patch, did not help:
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.cpp b/src/gallium/drivers/radeonsi/si_state_draw.cpp
index 8debad2d66e..178aaf1aa1c 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.cpp
+++ b/src/gallium/drivers/radeonsi/si_state_draw.cpp
@@ -2110,8 +2110,9 @@ static void si_draw(struct pipe_context *ctx,
     * direct draws.
     * 'instance_count == 0' seems to be problematic on Renoir chips (#4866),
     * so simplify the condition and drop these draws for all <= GFX9 chips.
+    * Update: enable the check for all GFX_VERSION values
     */
-   if (GFX_VERSION <= GFX9 && unlikely(!IS_DRAW_VERTEX_STATE && !indirect && !instance_count))
+   if (unlikely(!IS_DRAW_VERTEX_STATE && !indirect && !instance_count))
       return;
 
    struct si_shader_selector *vs = sctx->shader.vs.cso;
Freezes just using the desktop / file manager / a web browser.
Freezes can happen as quickly as 5-10 minutes after a fresh boot - or several hours.
My hardware info:
- Intel i7 9700k (not overclocked)
- 32 GB DDR4 RAM
- Several SATA SSDs and NVMe drives
- ASUS TUF AMD 6700 XT
- 2 x ViewSonic VP2768-4k monitors, each connected via DisplayPort
- Resolution 3840x2160 @ 60Hz
- Using 175% scaling in both KDE and GNOME
Log files (for system lockups / game freezes / crashes)
- Backtrace (for crashes)
- Output of dmesg
- Hang reports: Run with RADV_DEBUG=hang and attach the files created in $HOME/radv_dumps_*/
dmesg is attached. I'll try RADV_DEBUG=hang as soon as I can.
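For anyone comparing their own logs against this report, here is a minimal sketch of a filter that pulls the relevant lines out of a dmesg dump. The first pattern is the message from this issue's title; the second ("ring ... timeout") is an assumption about a related amdgpu message that often accompanies such hangs, and the function name find_hang_lines is purely illustrative.

```python
import re
import sys

# Patterns assumed to mark the hang. The first comes from this report's title;
# the second is an assumed amdgpu ring-timeout message and may not appear in
# every log.
PATTERNS = [
    re.compile(r"Waiting for fences timed out"),
    re.compile(r"ring \S+ timeout"),
]


def find_hang_lines(dmesg_text):
    """Return the lines of dmesg_text that match any hang pattern, in order."""
    return [
        line
        for line in dmesg_text.splitlines()
        if any(p.search(line) for p in PATTERNS)
    ]


if __name__ == "__main__":
    # Read a dmesg dump from stdin and print only the matching lines.
    for line in find_hang_lines(sys.stdin.read()):
        print(line)
```

Saved under a hypothetical name such as find_hangs.py, it could be run as "dmesg | python3 find_hangs.py" (dmesg may need root on some distros).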
Steps to reproduce
How can Mesa developers reproduce the issue? When reporting a game issue, start explaining from a fresh save file and don't assume prior knowledge of the game's story.
For me, I just have to use GNOME for a while to get the freeze. KDE seems more stable; freezes there are usually related to using Android Studio and/or GoLand.
System information
Please post inxi -GSC -xx output (fenced with triple backticks) OR fill information below manually
System: Host: maria Kernel: 5.15.2-2-MANJARO x86_64 bits: 64 compiler: gcc v: 11.1.0 Desktop: GNOME 41.1 tk: GTK 3.24.30
wm: gnome-shell dm: GDM Distro: Manjaro Linux base: Arch Linux
CPU: Info: 8-Core model: Intel Core i7-9700K bits: 64 type: MCP arch: Kaby Lake note: check rev: C cache: L1: 512 KiB
L2: 2 MiB L3: 12 MiB
flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 57616
Speed: 1594 MHz min/max: 800/4900 MHz Core speeds (MHz): 1: 1594 2: 1140 3: 2134 4: 3439 5: 3054 6: 2012 7: 1780
8: 4409
Graphics: Device-1: Advanced Micro Devices [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT / 6800M] vendor: ASUSTeK driver: amdgpu
v: kernel bus-ID: 03:00.0 chip-ID: 1002:73df
Display: x11 server: X.org 1.21.1.1 compositor: gnome-shell driver: loaded: modesetting
resolution: <missing: xdpyinfo>
OpenGL: renderer: AMD Radeon RX 6700 XT (NAVY_FLOUNDER DRM 3.42.0 5.15.2-2-MANJARO LLVM 13.0.0)
v: 4.6 Mesa 21.3.1 (git-9da08702b0) direct render: Yes
If applicable
- Xserver version: (sudo X -version) 1.21.1.1
Regression
Don't know. This is new hardware, and I didn't use it enough before Ubuntu and Fedora both upgraded.
Further information (optional)
Does the issue reproduce with the LLVM backend (RADV_DEBUG=llvm) or on the AMDGPU-PRO drivers?
I don't think AMDGPU-PRO supports my distros (Ubuntu 21.10, Fedora 35, Manjaro current).
Does your environment set any of the variables ACO_DEBUG, RADV_DEBUG, and RADV_PERFTEST?
Does not set any of those