[RX 6600 XT] GPU hang & reset; backtrace (refcount underflow, use-after-free, 6.0.10) (Witcher 3, ray-tracing)
I was testing the new build of Witcher 3 (with the fresh Proton hotfix that got its DX12 support working).
I switched on ray-tracing and got a slide-show for a few seconds, followed by a desktop freeze. After a few minutes, X exited.
Hardware description:
- CPU: Ryzen 3600
- GPU: RX 6600 XT
- Display(s): 2× 1080p, primary at 120Hz and secondary at 60Hz; plus a Vive (not enabled)
- Type of Display Connection: Primary DP, secondary HDMI
System information:
- Devuan stable (LLVM 14 & libdrm pulled in from testing)
- Kernel version: 6.0.10 (local build)
- Mesa 22.3.0 (local build, based on 22.3.0~kisak2~j)
Logs
I see nothing relevant showing up in the X log.
Kernel log extract:
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401431
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x1
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x3
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401431
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x1
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x3
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401431
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x1
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x3
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
amdgpu 0000:28:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32781, for process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781)
amdgpu 0000:28:00.0: amdgpu: in page starting at address 0x000080003066c000 from client 0x1b (UTCL2)
amdgpu 0000:28:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
amdgpu 0000:28:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
amdgpu 0000:28:00.0: amdgpu: MORE_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: WALKER_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: PERMISSION_FAULTS: 0x0
amdgpu 0000:28:00.0: amdgpu: MAPPING_ERROR: 0x0
amdgpu 0000:28:00.0: amdgpu: RW: 0x0
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring comp_1.3.0 timeout, signaled seq=604410, emitted seq=604411
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process witcher3.exe pid 24664 thread WorkSubmissionT pid 24781
amdgpu 0000:28:00.0: amdgpu: GPU reset begin!
[drm] free PSP TMR buffer
amdgpu 0000:28:00.0: amdgpu: MODE1 reset
amdgpu 0000:28:00.0: amdgpu: GPU mode1 reset
amdgpu 0000:28:00.0: amdgpu: GPU smu mode1 reset
amdgpu 0000:28:00.0: amdgpu: GPU reset succeeded, trying to resume
[drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
[drm] VRAM is lost due to GPU reset!
[drm] PSP is resuming...
[drm] reserve 0xa00000 from 0x80f8000000 for PSP TMR
amdgpu 0000:28:00.0: amdgpu: RAS: optional ras ta ucode is not available
amdgpu 0000:28:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
amdgpu 0000:28:00.0: amdgpu: SMU is resuming...
amdgpu 0000:28:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2900 (59.41.0)
amdgpu 0000:28:00.0: amdgpu: SMU driver if version not matched
amdgpu 0000:28:00.0: amdgpu: use vbios provided pptable
amdgpu 0000:28:00.0: amdgpu: SMU is resumed successfully!
[drm] DMUB hardware initialized: version=0x02020013
[drm] kiq ring mec 2 pipe 1 q 0
[drm] VCN decode and encode initialized successfully(under DPG Mode).
[drm] JPEG decode initialized successfully.
amdgpu 0000:28:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
amdgpu 0000:28:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
amdgpu 0000:28:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
amdgpu 0000:28:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
amdgpu 0000:28:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
amdgpu 0000:28:00.0: amdgpu: recover vram bo from shadow start
amdgpu 0000:28:00.0: amdgpu: recover vram bo from shadow done
[drm] Skip scheduling IBs!
amdgpu 0000:28:00.0: amdgpu: GPU reset(1) succeeded!
[drm] Skip scheduling IBs!
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 4 PID: 788 at refcount_warn_saturate+0xba/0x110
Modules linked in: snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq bnep cpufreq_conservative md4 vfat fat nct6775 nct6775_core hwmon_vid em28xx_rc em28xx_dvb pktcdvd btusb btbcm si2157 btintel si2168
amdgpu_cs_ioctl: 150 callbacks suppressed
snd_usb_audio uvcvideo snd_hda_codec_realtek em28xx snd_hda_codec_generic snd_usbmidi_lib bluetooth snd_hwdep videobuf2_vmalloc i2c_mux videobuf2_memops snd_rawmidi videobuf2_v4l2 videobuf2_common dvb_usb_dvbsky joydev tveeprom snd_seq_device ledtrig_audio snd_hda_codec_hdmi amdgpu snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core drm_ttm_helper ttm mfd_core gpu_sched drm_buddy snd_pcm_oss drm_display_helper snd_mixer_oss snd_pcm drm_kms_helper snd_timer sg k10temp syscopyarea sysfillrect sysimgblt sp5100_tco fb_sys_fops efivarfs
CPU: 4 PID: 788 Comm: gfx_0.0.0 Tainted: G D 6.0.10 #1
Hardware name: Micro-Star International Co., Ltd MS-7C02/B450 TOMAHAWK MAX (MS-7C02), BIOS 3.C3 09/27/2021
RIP: 0010:refcount_warn_saturate+0xba/0x110
Code: 00 01 e8 62 55 43 00 0f 0b e9 22 6e 80 00 80 3d 2d 1d f8 00 00 75 85 48 c7 c7 70 2e 06 82 c6 05 1d 1d f8 00 01 e8 3f 55 43 00 <0f> 0b e9 ff 6d 80 00 80 3d 08 1d f8 00 00 0f 85 5e ff ff ff 48 c7
RSP: 0018:ffffc90000cbbea0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88810a9a9658 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff8204e8cb RDI: 00000000ffffffff
RBP: ffff88810a9a9598 R08: 0000000000000000 R09: 00000000ffffdfff
R10: ffffc90000cbbd40 R11: ffffffff82313068 R12: ffff8881c6d350a8
R13: ffff88810a9a9710 R14: 0000000000000000 R15: ffff888188cd60c0
FS: 0000000000000000(0000) GS:ffff88842ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000358037ea400 CR3: 000000015f5ea000 CR4: 0000000000350ee0
[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Call Trace:
<TASK>
drm_sched_main+0x69/0x3e0 [gpu_sched]
? destroy_sched_domains_rcu+0x20/0x20
? drm_sched_job_done_cb+0x10/0x10 [gpu_sched]
kthread+0xe2/0x110
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
---[ end trace 0000000000000000 ]---
[drm] Skip scheduling IBs!
[drm] Skip scheduling IBs!
[… 110 more of the above line…]
[drm] Skip scheduling IBs!
[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
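
For anyone reading the fault records above: the per-field lines (MORE_FAULTS, PERMISSION_FAULTS, etc.) look like plain bit-fields of GCVM_L2_PROTECTION_FAULT_STATUS. Below is a minimal sketch that reproduces them from the raw value. The bit positions are an assumption inferred from the logged values (0x00401431 decoding to MORE_FAULTS 0x1, PERMISSION_FAULTS 0x3, client ID 0xa, vmid 4) and have not been checked against the amdgpu register headers; `FIELDS` and `decode_fault_status` are hypothetical names used only for illustration.

```python
# Sketch: split a GCVM_L2_PROTECTION_FAULT_STATUS value into named fields.
# ASSUMPTION: the (shift, width) pairs below are inferred from the values
# printed in the kernel log above, not taken from the driver headers.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),   # the "Faulty UTCL2 client ID" in the log
    "RW": (18, 1),
    "VMID": (20, 4),
}


def decode_fault_status(status):
    """Return a dict mapping each field name to its value in `status`."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # The two distinct status values seen in the log above.
    for raw in (0x00401431, 0x00000000):
        print(f"status {raw:#010x}")
        for name, value in decode_fault_status(raw).items():
            print(f"  {name}: {value:#x}")
```

Under that assumed layout, the first three records decode to the same values the kernel printed. The later records with status 0x00000000 decode to all zeros, so the "CB/DB (0x0)" client ID reported there may simply be the zero value rather than a real faulting client.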
Edited by Darren Salt