5600 XT warning during gpu reset
Brief summary of the problem:
My GPU has crashed a couple of times, each time with a GPU reset. Usually the system freezes for a bit, and when it becomes responsive again some graphical artifacts appear and the displays stay frozen. Sometimes I can recover by restarting X from another TTY, but other times I need to reboot.
I've seen a few similar issues already reported, but the log was different, so this might be a different issue.
Hardware description:
- CPU: AMD Ryzen 7 3700X
- GPU: Sapphire Pulse RX 5600 XT
- System Memory: G.Skill Ripjaws V 16 GB (2 x 8 GB) DDR4-3600 CL18
- Display(s): A 1440p Acer and a 1900x1200 Dell
- Type of Display Connection: both displays use DisplayPort
System information:
- Distro name and Version: Arch Linux
- Kernel version: 5.8.5
- Custom kernel: no
- AMD package version: No package
- mesa 20.1.7-1
- xf86-video-amdgpu 19.1.0
How to reproduce the issue:
The crash seems pretty random. I'm usually not doing anything very intensive when it happens.
Here's the relevant section of the log. The full log is attached.
Sep 02 17:50:28 phAMD kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Sep 02 17:50:33 phAMD kernel: [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for fences timed out!
Sep 02 17:50:33 phAMD kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=5933540, emitted seq=5933542
Sep 02 17:50:33 phAMD kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 665 thread Xorg:cs0 pid 676
Sep 02 17:50:33 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset begin!
Sep 02 17:50:33 phAMD kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:928
Sep 02 17:50:33 phAMD kernel: ------------[ cut here ]------------
Sep 02 17:50:33 phAMD kernel: WARNING: CPU: 15 PID: 2830 at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:526 generic_reg_wait.cold+0x26/0x2d [amdgpu]
Sep 02 17:50:33 phAMD kernel: Modules linked in: fuse uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi xpad snd_seq_device wacom mc ff_memless cfg80211 mousedev input_leds joydev rfkill 8021q garp mrp stp llc nct6775 hwmon_vid snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi nls_iso8859_1 amdgpu nls_cp437 snd_hda_intel vfat fat snd_intel_dspcfg edac_mce_amd snd_hda_codec gpu_sched i2c_algo_bit ttm kvm_amd snd_hda_core snd_hwdep kvm drm_kms_helper snd_pcm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_timer r8169 cec aesni_intel ccp rc_core snd crypto_simd sp5100_tco syscopyarea cryptd realtek sysfillrect glue_helper sysimgblt wmi_bmof pcspkr libphy fb_sys_fops k10temp rapl i2c_piix4 soundcore rng_core evdev acpi_cpufreq pinctrl_amd mac_hid gpio_amdpt drm agpgart ip_tables x_tables hid_generic usbhid hid btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq xhci_pci crc32c_intel
Sep 02 17:50:33 phAMD kernel: xhci_pci_renesas xhci_hcd wmi
Sep 02 17:50:33 phAMD kernel: CPU: 15 PID: 2830 Comm: kworker/15:0 Tainted: G W 5.8.5-arch1-1 #1
Sep 02 17:50:33 phAMD kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B550M Pro4, BIOS P1.10 06/12/2020
Sep 02 17:50:33 phAMD kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Sep 02 17:50:33 phAMD kernel: RIP: 0010:generic_reg_wait.cold+0x26/0x2d [amdgpu]
Sep 02 17:50:33 phAMD kernel: Code: 61 3f fd ff 44 8b 44 24 24 48 8b 4c 24 18 89 ee 48 c7 c7 38 d7 10 c1 8b 54 24 20 e8 94 04 50 cb 83 7b 20 01 0f 84 7d 50 fd ff <0f> 0b e9 76 50 fd ff 48 c7 c7 fd d1 0b c1 e8 4d 17 91 cb e8 24 e1
Sep 02 17:50:33 phAMD kernel: RSP: 0018:ffffa4ae80d4fad8 EFLAGS: 00010297
Sep 02 17:50:33 phAMD kernel: RAX: 0000000000000041 RBX: ffff8ba4482f7e00 RCX: 0000000000000000
Sep 02 17:50:33 phAMD kernel: RDX: 0000000000000000 RSI: ffffffff8d76ac7f RDI: 00000000ffffffff
Sep 02 17:50:33 phAMD kernel: RBP: 0000000000000001 R08: 000000000000067d R09: 0000000000000001
Sep 02 17:50:33 phAMD kernel: R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000001
Sep 02 17:50:33 phAMD kernel: R13: 0000000000000002 R14: 0000000000003ab3 R15: 00000000000000c9
Sep 02 17:50:33 phAMD kernel: FS: 0000000000000000(0000) GS:ffff8ba45ebc0000(0000) knlGS:0000000000000000
Sep 02 17:50:33 phAMD kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 02 17:50:33 phAMD kernel: CR2: 00007f69436e7ffc CR3: 00000003bb25e000 CR4: 0000000000340ee0
Sep 02 17:50:33 phAMD kernel: Call Trace:
Sep 02 17:50:33 phAMD kernel: hubp2_set_blank+0xc4/0xd0 [amdgpu]
Sep 02 17:50:33 phAMD kernel: dcn10_wait_for_mpcc_disconnect+0xdb/0x130 [amdgpu]
Sep 02 17:50:33 phAMD kernel: dcn20_plane_atomic_disable+0x3e/0x150 [amdgpu]
Sep 02 17:50:33 phAMD kernel: dcn20_disable_plane+0x24/0x40 [amdgpu]
Sep 02 17:50:33 phAMD kernel: dcn20_post_unlock_program_front_end+0x6f/0x1b0 [amdgpu]
Sep 02 17:50:33 phAMD kernel: dc_commit_state+0x687/0x970 [amdgpu]
Sep 02 17:50:33 phAMD kernel: ? kernel_fpu_end+0x1e/0x30
Sep 02 17:50:33 phAMD kernel: ? dc_rem_all_planes_for_stream+0xcb/0x110 [amdgpu]
Sep 02 17:50:33 phAMD kernel: amdgpu_dm_commit_zero_streams+0x12d/0x140 [amdgpu]
Sep 02 17:50:33 phAMD kernel: dm_suspend+0x9a/0xb0 [amdgpu]
Sep 02 17:50:33 phAMD kernel: amdgpu_device_ip_suspend_phase1+0x83/0xe0 [amdgpu]
Sep 02 17:50:33 phAMD kernel: ? amdgpu_fence_process+0x4d/0x140 [amdgpu]
Sep 02 17:50:33 phAMD kernel: amdgpu_device_ip_suspend+0x1c/0x60 [amdgpu]
Sep 02 17:50:33 phAMD kernel: amdgpu_device_gpu_recover.cold+0x653/0xfd4 [amdgpu]
Sep 02 17:50:33 phAMD kernel: amdgpu_job_timedout+0x121/0x140 [amdgpu]
Sep 02 17:50:33 phAMD kernel: drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
Sep 02 17:50:33 phAMD kernel: process_one_work+0x1da/0x3d0
Sep 02 17:50:33 phAMD kernel: worker_thread+0x4d/0x3d0
Sep 02 17:50:33 phAMD kernel: ? rescuer_thread+0x410/0x410
Sep 02 17:50:33 phAMD kernel: kthread+0x142/0x160
Sep 02 17:50:33 phAMD kernel: ? __kthread_bind_mask+0x60/0x60
Sep 02 17:50:33 phAMD kernel: ret_from_fork+0x22/0x30
Sep 02 17:50:33 phAMD kernel: ---[ end trace ec16cbe45c87499b ]---
Sep 02 17:50:33 phAMD kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:928
Sep 02 17:50:33 phAMD kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:928
Sep 02 17:50:33 phAMD kernel: [drm] REG_WAIT timeout 1us * 200 tries - hubp2_set_blank line:928
Sep 02 17:50:33 phAMD kernel: amdgpu 0000:0b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Sep 02 17:50:33 phAMD kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KGQ disable failed
Sep 02 17:50:33 phAMD kernel: amdgpu 0000:0b:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Sep 02 17:50:33 phAMD kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* KCQ disable failed
Sep 02 17:50:34 phAMD kernel: [drm:gfx_v10_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
Sep 02 17:50:34 phAMD kernel: [drm] free PSP TMR buffer
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset succeeded, trying to resume
Sep 02 17:50:36 phAMD kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000000000).
Sep 02 17:50:36 phAMD kernel: [drm] VRAM is lost due to GPU reset!
Sep 02 17:50:36 phAMD kernel: [drm] PSP is resuming...
Sep 02 17:50:36 phAMD kernel: [drm] reserve 0x900000 from 0x8170000000 for PSP TMR
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: RAS: optional ras ta ucode is not available
Sep 02 17:50:36 phAMD kernel: amdgpu: SMU is resuming...
Sep 02 17:50:36 phAMD kernel: amdgpu: SMU is resumed successfully!
Sep 02 17:50:36 phAMD kernel: [drm] kiq ring mec 2 pipe 1 q 0
Sep 02 17:50:36 phAMD kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode).
Sep 02 17:50:36 phAMD kernel: [drm] JPEG decode initialized successfully.
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring vcn_dec uses VM inv eng 0 on hub 1
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring vcn_enc0 uses VM inv eng 1 on hub 1
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring vcn_enc1 uses VM inv eng 4 on hub 1
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
Sep 02 17:50:36 phAMD kernel: [drm] recover vram bo from shadow start
Sep 02 17:50:36 phAMD kernel: [drm] recover vram bo from shadow done
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: amdgpu 0000:0b:00.0: amdgpu: GPU reset(2) succeeded!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
Sep 02 17:50:36 phAMD kernel: [drm] Skip scheduling IBs!
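In case it helps with comparing this against the similar reports, here is a minimal sketch that pulls the ring timeout lines out of a saved log and shows how far the emitted sequence number was ahead of the signaled one. The function name `find_ring_timeouts` and the `pending` field are my own, not anything from the driver, and reading the difference as "fences submitted but not yet signaled" is my assumption about what the two counters mean.

```python
import re

# Matches the amdgpu_job_timedout message in the form it has in the log above.
RING_TIMEOUT = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def find_ring_timeouts(log_text):
    """Return one dict per ring timeout line found in log_text."""
    results = []
    for line in log_text.splitlines():
        match = RING_TIMEOUT.search(line)
        if match is None:
            continue
        signaled = int(match.group("signaled"))
        emitted = int(match.group("emitted"))
        results.append(
            {
                "ring": match.group("ring"),
                "signaled": signaled,
                "emitted": emitted,
                # Assumption: the gap is the number of outstanding fences.
                "pending": emitted - signaled,
            }
        )
    return results
```

Run against the excerpt above, this should report a single timeout on ring gfx_0.0.0 with the emitted counter two ahead of the signaled one.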