drm/dri radeonsi crashes while watching video or during video call
System information
System: Host: lenovo Kernel: 5.9.8-arch1-1 x86_64 bits: 64 compiler: N/A Desktop: GNOME 3.38.1 tk: GTK 3.24.23
wm: gnome-shell dm: GDM Distro: Arch Linux
CPU: Info: 8-Core model: Intel Core i7-9700K bits: 64 type: MCP arch: Kaby Lake rev: D L2 cache: 12.0 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 57616
Speed: 900 MHz min/max: 800/4900 MHz Core speeds (MHz): 1: 900 2: 900 3: 900 4: 900 5: 900 6: 899 7: 900 8: 900
Graphics: Device-1: Intel UHD Graphics 630 vendor: Lenovo driver: i915 v: kernel bus ID: 00:02.0 chip ID: 8086:3e98
Device-2: Advanced Micro Devices [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] vendor: XFX Pine
driver: amdgpu v: kernel bus ID: 03:00.0 chip ID: 1002:731f
Display: x11 server: X.Org 1.20.9 compositor: gnome-shell driver: amdgpu,intel unloaded: modesetting,vesa
alternate: ati,fbdev resolution: 1: 2560x1080~60Hz 2: 2560x1080~60Hz 3: 1920x1080 s-dpi: 96
OpenGL: renderer: AMD Radeon RX 5700 XT (NAVI10 DRM 3.39.0 5.9.8-arch1-1 LLVM 11.0.0) v: 4.6 Mesa 20.2.2
direct render: Yes
Describe the issue
The GPU resets while watching YouTube videos in Chrome or Firefox, and also during video calls in Teams. It now happens often, at least a couple of times a day. I have a multi-monitor setup. I've tried everything I could find in the Arch wiki and online; nothing helped.
Attached log: dmesg_backtrace_gpu_reset.log
Regression
It started recently, maybe a couple of months back. I didn't pay much attention at first because AMD video drivers are known to be buggy.
dmesg backtrace:
....
[ 7443.594896] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 7443.595181] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 7443.595482] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
[ 7448.504955] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[ 7535.914559] gnome-shell[1417]: segfault at 5559435f04f0 ip 00005559435f04f0 sp 00007ffeefb7dc68 error 15
[ 7535.914562] Code: 00 00 03 00 00 00 00 00 00 00 42 53 54 00 00 00 00 00 21 00 00 00 00 00 00 00 00 05 5f 43 59 55 00 00 03 00 00 00 00 00 00 00 <47> 4d 54 00 00 00 00 00 21 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 7535.914639] audit: type=1701 audit(1605147975.256:138): auid=1000 uid=1000 gid=1000 ses=4 pid=1417 comm="gnome-shell" exe="/usr/bin/gnome-shell" sig=11 res=1
[ 7535.928403] audit: type=1334 audit(1605147975.270:139): prog-id=23 op=LOAD
[ 7535.928530] audit: type=1334 audit(1605147975.270:140): prog-id=24 op=LOAD
[ 7535.929168] audit: type=1130 audit(1605147975.270:141): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@0-26390-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 7537.716638] audit: type=1334 audit(1605147977.060:142): prog-id=25 op=LOAD
[ 7537.716758] audit: type=1334 audit(1605147977.060:143): prog-id=26 op=LOAD
[ 7537.926142] audit: type=1130 audit(1605147977.266:144): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-hostnamed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[ 7538.048112] ------------[ cut here ]------------
[ 7538.048181] WARNING: CPU: 7 PID: 1294 at drivers/gpu/drm/amd/amdgpu/../display/dc/dcn20/dcn20_hwseq.c:111 dcn20_setup_gsl_group_as_lock+0x83/0x240 [amdgpu]
[ 7538.048182] Modules linked in: tun nls_utf8 isofs cdrom fuse uvcvideo videobuf2_vmalloc videobuf2_memops uas videobuf2_v4l2 usb_storage videobuf2_common btusb btrtl btbcm btintel bluetooth ecdh_generic ecc hid_logitech_hidpp mousedev joydev hid_logitech_dj snd_usb_audio snd_usbmidi_lib snd_rawmidi ti_usb_3410_5052 snd_seq_device input_leds hid_generic usbhid hid snd_hda_codec_realtek snd_hda_codec_generic rfkill snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof ledtrig_audio snd_soc_skl snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi intel_rapl_msr iTCO_wdt intel_rapl_common mei_wdt snd_soc_core intel_pmc_bxt mei_hdcp ee1004 iTCO_vendor_support 8250_dw snd_compress wmi_bmof intel_wmi_thunderbolt x86_pkg_temp_thermal snd_hda_codec_hdmi ac97_bus intel_powerclamp snd_pcm_dmaengine coretemp snd_hda_intel kvm_intel kvm snd_intel_dspcfg irqbypass crct10dif_pclmul
[ 7538.048195] crc32_pclmul ghash_clmulni_intel snd_hda_codec aesni_intel nls_iso8859_1 nls_cp437 crypto_simd cryptd vfat glue_helper rapl i915 fat snd_hda_core intel_cstate e1000e pcspkr intel_uncore ofpart snd_hwdep cmdlinepart snd_pcm intel_spi_pci intel_spi snd_timer i2c_i801 spi_nor snd mei_me mtd i2c_smbus tpm_crb soundcore mei intel_gtt intel_lpss_pci intel_lpss idma64 wmi tpm_tis tpm_tis_core tpm rng_core evdev mac_hid acpi_tad vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) v4l2loopback(OE) videodev mc sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 xhci_pci xhci_pci_renesas crc32c_intel xhci_hcd amdgpu gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm agpgart
[ 7538.048210] CPU: 7 PID: 1294 Comm: Xorg Tainted: G OE 5.9.8-arch1-1 #1
[ 7538.048210] Hardware name: LENOVO 30D0S3YL02/314F, BIOS M1VKT59A 07/07/2020
[ 7538.048264] RIP: 0010:dcn20_setup_gsl_group_as_lock+0x83/0x240 [amdgpu]
[ 7538.048265] Code: 84 c0 75 45 48 8b 87 78 03 00 00 0f b6 80 70 02 00 00 a8 01 0f 84 2c 01 00 00 a8 02 0f 84 86 00 00 00 a8 04 0f 84 43 01 00 00 <0f> 0b b9 81 00 00 00 48 c7 c2 e0 e0 5d c0 bf 02 00 00 00 48 c7 c6
[ 7538.048265] RSP: 0018:ffff9331c2eff7f0 EFLAGS: 00010202
[ 7538.048266] RAX: 0000000000000007 RBX: ffff89ffa1a80690 RCX: 0000000000000000
[ 7538.048267] RDX: 0000000000000001 RSI: ffff89ffa1a80690 RDI: ffff8a035d0c0000
[ 7538.048267] RBP: 0000000000000001 R08: ffff9331c2eff7bc R09: 0000000000000000
[ 7538.048268] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000001
[ 7538.048268] R13: ffff8a035d0c0000 R14: ffff8a021e311800 R15: ffff89ffa1a80000
[ 7538.048269] FS: 00007f0730320540(0000) GS:ffff8a036c5c0000(0000) knlGS:0000000000000000
[ 7538.048270] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7538.048270] CR2: 000055b640584000 CR3: 0000000fef1ac006 CR4: 00000000003726e0
[ 7538.048271] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7538.048271] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7538.048272] Call Trace:
[ 7538.048324] dcn20_pipe_control_lock+0x244/0x2a0 [amdgpu]
[ 7538.048374] dcn10_lock_all_pipes+0x91/0xc0 [amdgpu]
[ 7538.048421] dc_commit_updates_for_stream+0xe67/0x1bf0 [amdgpu]
[ 7538.048465] ? dc_stream_get_scanoutpos+0x5f/0x70 [amdgpu]
[ 7538.048512] ? dm_crtc_get_scanoutpos+0x85/0xe0 [amdgpu]
[ 7538.048559] amdgpu_dm_atomic_commit_tail+0x13e1/0x23b0 [amdgpu]
[ 7538.048569] commit_tail+0x94/0x130 [drm_kms_helper]
[ 7538.048573] drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
[ 7538.048585] drm_mode_obj_set_property_ioctl+0x156/0x3d0 [drm]
[ 7538.048593] ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
[ 7538.048598] drm_ioctl_kernel+0xb2/0x100 [drm]
[ 7538.048604] drm_ioctl+0x215/0x390 [drm]
[ 7538.048611] ? drm_mode_obj_find_prop_id+0x40/0x40 [drm]
[ 7538.048636] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[ 7538.048639] __x64_sys_ioctl+0x83/0xb0
[ 7538.048641] do_syscall_64+0x33/0x40
[ 7538.048642] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 7538.048643] RIP: 0033:0x7f0730ce3f6b
[ 7538.048644] Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 ae 0c 00 f7 d8 64 89 01 48
[ 7538.048645] RSP: 002b:00007fffb4e81ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 7538.048645] RAX: ffffffffffffffda RBX: 00007fffb4e81ee0 RCX: 00007f0730ce3f6b
[ 7538.048646] RDX: 00007fffb4e81ee0 RSI: 00000000c01864ba RDI: 000000000000000c
[ 7538.048646] RBP: 00000000c01864ba R08: 0000000000000072 R09: 00000000cccccccc
[ 7538.048647] R10: 0000000000000fff R11: 0000000000000246 R12: 000055b63f3d09e0
[ 7538.048647] R13: 000000000000000c R14: 0000000000000000 R15: 0000000000000003
[ 7538.048649] ---[ end trace 3cebf8f55783e0b1 ]---
....
GDM backtrace from journalctl:
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE)
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) Backtrace:
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 0: /usr/lib/Xorg (xorg_backtrace+0x53) [0x55b63e394bd3]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 1: /usr/lib/Xorg (0x55b63e24e000+0x151a15) [0x55b63e39fa15]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 2: /usr/lib/libc.so.6 (0x7f0730bed000+0x3d6a0) [0x7f0730c2a6a0]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 3: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0xe1aa81) [0x7f072bb99a81]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 4: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x8112ed) [0x7f072b5902ed]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 5: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x852549) [0x7f072b5d1549]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 6: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x8472e5) [0x7f072b5c62e5]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 7: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x857aa3) [0x7f072b5d6aa3]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 8: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x83d8d9) [0x7f072b5bc8d9]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 9: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x824b29) [0x7f072b5a3b29]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 10: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0xaeaa3f) [0x7f072b869a3f]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 11: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x13b3bf) [0x7f072aeba3bf]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 12: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x2d4607) [0x7f072b053607]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 13: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x2d481c) [0x7f072b05381c]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 14: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0x2d4905) [0x7f072b053905]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 15: /usr/lib/xorg/modules/libglamoregl.so (0x7f072c4e4000+0x26cb7) [0x7f072c50acb7]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 16: /usr/lib/xorg/modules/libglamoregl.so (0x7f072c4e4000+0xf614) [0x7f072c4f3614]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 17: /usr/lib/xorg/modules/libglamoregl.so (0x7f072c4e4000+0xfd75) [0x7f072c4f3d75]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 18: /usr/lib/Xorg (miCopyRegion+0x97) [0x55b63e2924d7]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 19: /usr/lib/Xorg (miDoCopy+0x466) [0x55b63e296396]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 20: /usr/lib/xorg/modules/libglamoregl.so (0x7f072c4e4000+0x8834) [0x7f072c4ec834]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 21: /usr/lib/Xorg (0x55b63e24e000+0xc58d6) [0x55b63e3138d6]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 22: /usr/lib/Xorg (0x55b63e24e000+0x73233) [0x55b63e2c1233]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 23: /usr/lib/Xorg (0x55b63e24e000+0x3a165) [0x55b63e288165]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 24: /usr/lib/libc.so.6 (__libc_start_main+0xf2) [0x7f0730c15152]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 25: /usr/lib/Xorg (_start+0x2e) [0x55b63e2885ae]
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE)
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) Segmentation fault at address 0x7f07107f9000
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE)
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: Fatal server error:
Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) Caught signal 11 (Segmentation fault). Server aborting
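Most frames in the Xorg backtrace above are unsymbolized and only show `(load base+offset) [address]`. In case it helps triage, here is a minimal sketch of how the per-module offsets could be pulled out of the journal lines so they can be fed to a symbolizer (for example `addr2line`) against a radeonsi build with debug symbols. The helper name `parse_frames` and the sample list are hypothetical, written only for illustration; the regular expression assumes the exact line format shown in this report.

```python
import re

# Matches unsymbolized frames such as:
#   (EE) 3: /usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0xe1aa81) [0x7f072bb99a81]
# Frames that already carry a symbol, e.g. "(miCopyRegion+0x97)", are skipped.
FRAME_RE = re.compile(
    r"\(EE\)\s+(?P<index>\d+):\s+(?P<module>\S+)\s+"
    r"\(0x(?P<base>[0-9a-f]+)\+0x(?P<offset>[0-9a-f]+)\)\s+"
    r"\[0x(?P<addr>[0-9a-f]+)\]"
)


def parse_frames(lines):
    """Return (frame index, module path, offset, consistent) per raw frame.

    'consistent' is True when base + offset equals the absolute address
    printed in square brackets on the same line.
    """
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue
        base = int(match["base"], 16)
        offset = int(match["offset"], 16)
        addr = int(match["addr"], 16)
        frames.append(
            (int(match["index"]), match["module"], offset, base + offset == addr)
        )
    return frames


# Two lines copied from the journal output in this report.
sample = [
    "Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 3: "
    "/usr/lib/dri/radeonsi_dri.so (0x7f072ad7f000+0xe1aa81) [0x7f072bb99a81]",
    "Nov 12 02:26:24 lenovo /usr/lib/gdm-x-session[1294]: (EE) 18: "
    "/usr/lib/Xorg (miCopyRegion+0x97) [0x55b63e2924d7]",
]

for index, module, offset, consistent in parse_frames(sample):
    print(f"#{index} {module} +{offset:#x} consistent={consistent}")
```

The offsets are relative to each module's load base, so they only resolve meaningfully against the same binary build that produced the crash.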
Edited by Octavian Stolnicu