RX6700XT: [gfxhub] page fault + [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout
Brief summary of the problem:
The GPU hangs at random. Sometimes it recovers gracefully afterwards, sometimes the hang kills xorg-server. A dmesg snippet from one crash:
[ 716.928693] gmc_v10_0_process_interrupt: 46 callbacks suppressed
[ 716.928700] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928710] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180384000 from client 0x1b (UTCL2)
[ 716.928716] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 716.928719] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 716.928723] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x1
[ 716.928727] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928730] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 716.928733] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928735] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928743] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928749] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180285000 from client 0x1b (UTCL2)
[ 716.928753] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031
[ 716.928756] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[ 716.928759] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x1
[ 716.928762] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928765] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 716.928768] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928771] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928777] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928782] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180388000 from client 0x1b (UTCL2)
[ 716.928785] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928788] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928792] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928795] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928797] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928800] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928803] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928810] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928815] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180388000 from client 0x1b (UTCL2)
[ 716.928818] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928821] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928824] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928827] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928830] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928833] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928836] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928842] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928846] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180285000 from client 0x1b (UTCL2)
[ 716.928850] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928853] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928856] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928859] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928861] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928864] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928867] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928874] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928878] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180388000 from client 0x1b (UTCL2)
[ 716.928882] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928884] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928887] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928890] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928893] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928896] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928899] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928906] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928910] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180285000 from client 0x1b (UTCL2)
[ 716.928913] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928916] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928919] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928922] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928925] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928928] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928931] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928937] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928942] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180388000 from client 0x1b (UTCL2)
[ 716.928945] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928948] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928951] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928954] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928957] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928959] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928962] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.928969] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.928973] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180285000 from client 0x1b (UTCL2)
[ 716.928977] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.928979] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.928983] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.928985] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.928988] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.928991] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.928994] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 716.929000] amdgpu 0000:0d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1163 thread Xorg:cs0 pid 1165)
[ 716.929005] amdgpu 0000:0d:00.0: amdgpu: in page starting at address 0x0000800180390000 from client 0x1b (UTCL2)
[ 716.929008] amdgpu 0000:0d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 716.929011] amdgpu 0000:0d:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0)
[ 716.929014] amdgpu 0000:0d:00.0: amdgpu: MORE_FAULTS: 0x0
[ 716.929017] amdgpu 0000:0d:00.0: amdgpu: WALKER_ERROR: 0x0
[ 716.929020] amdgpu 0000:0d:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 716.929022] amdgpu 0000:0d:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 716.929025] amdgpu 0000:0d:00.0: amdgpu: RW: 0x0
[ 727.048598] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=26147, emitted seq=26149
[ 727.048951] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1163 thread Xorg:cs0 pid 1165
The number of page faults varies from hang to hang.
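The raw GCVM_L2_PROTECTION_FAULT_STATUS words in the snippet can be unpacked by hand as a cross-check. This is a minimal Python sketch, not the driver's code: the bit layout in FIELDS is an assumption, chosen because it reproduces the fields the driver printed next to each status word (MORE_FAULTS, PERMISSION_FAULTS, client ID 0x8) and the vmid shown on the page fault line.

```python
# Assumed bit layout of GCVM_L2_PROTECTION_FAULT_STATUS: (name, shift, width).
# This table is an assumption checked only against the log lines in this report.
FIELDS = [
    ("MORE_FAULTS", 0, 1),
    ("WALKER_ERROR", 1, 3),
    ("PERMISSION_FAULTS", 4, 4),
    ("MAPPING_ERROR", 8, 1),
    ("CID", 9, 9),
    ("RW", 18, 1),
    ("VMID", 20, 4),
]


def decode_fault_status(status: int) -> dict:
    """Split a raw status word into named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, shift, width in FIELDS
    }


if __name__ == "__main__":
    # The two distinct status values that appear in the snippet.
    for raw in (0x00101031, 0x00000000):
        fields = decode_fault_status(raw)
        print(f"0x{raw:08x}: " + ", ".join(f"{k}=0x{v:x}" for k, v in fields.items()))
```

Under this layout, 0x00101031 decodes to a permission fault from client 0x8 (TCP) in VMID 1, while the all-zero words carry no fault information, which matches the lines the driver logged for them.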
Hardware description:
- CPU: AMD Ryzen 7 5800X 8-Core Processor
- GPU: 0d:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT] [1002:73df] (rev c5) / Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT] / Navy Flounder / Asus Dual Radeon™ RX 6700 XT OC Edition
- System Memory: 32GiB
- Display(s): Eizo CS240
- Type of Display Connection: DP
System information:
- Distro name and Version: Debian stable 12.2
- Mesa version: 22.3.6-1+deb12u1 (from Debian repos)
- Custom kernel: Linux octo 6.5.7 #6 SMP PREEMPT_DYNAMIC Sun Oct 22 23:08:35 CEST 2023 x86_64 GNU/Linux (self-built via make bindeb-pkg + candidate patches from #2627 applied)
- AMD official driver version: NA, using Linux's amdgpu
- Firmware version - should be from linux-firmware.git at commit a3bcbbf2e5d13b49197ecd39ae47715515bf38c2 (latest at this point).
VCE feature version: 0, firmware version: 0x00000000
UVD feature version: 0, firmware version: 0x00000000
MC feature version: 0, firmware version: 0x00000000
ME feature version: 44, firmware version: 0x00000040
PFP feature version: 44, firmware version: 0x00000061
CE feature version: 44, firmware version: 0x00000025
RLC feature version: 1, firmware version: 0x0000004a
RLC SRLC feature version: 0, firmware version: 0x00000000
RLC SRLG feature version: 0, firmware version: 0x00000000
RLC SRLS feature version: 0, firmware version: 0x00000000
RLCP feature version: 0, firmware version: 0x00000000
RLCV feature version: 0, firmware version: 0x00000000
MEC feature version: 44, firmware version: 0x00000073
MEC2 feature version: 44, firmware version: 0x00000073
IMU feature version: 0, firmware version: 0x00000000
SOS feature version: 0, firmware version: 0x00220a0c
ASD feature version: 553648303, firmware version: 0x210000af
TA XGMI feature version: 0x00000000, firmware version: 0x00000000
TA RAS feature version: 0x00000000, firmware version: 0x00000000
TA HDCP feature version: 0x00000000, firmware version: 0x1700003a
TA DTM feature version: 0x00000000, firmware version: 0x12000015
TA RAP feature version: 0x00000000, firmware version: 0x07000213
TA SECUREDISPLAY feature version: 0x00000000, firmware version: 0x00000000
SMC feature version: 0, program: 0, firmware version: 0x00413b00 (65.59.0)
SDMA0 feature version: 52, firmware version: 0x00000050
SDMA1 feature version: 52, firmware version: 0x00000050
VCN feature version: 0, firmware version: 0x0211d002
DMCU feature version: 0, firmware version: 0x00000000
DMCUB feature version: 0, firmware version: 0x02020020
TOC feature version: 0, firmware version: 0x00000000
MES_KIQ feature version: 0, firmware version: 0x00000000
MES feature version: 0, firmware version: 0x00000000
VBIOS version: 115-D512BS0-100
How to reproduce the issue:
The hang happens at random, but for some reason background music playback (audacious and especially deadbeef) makes hangs much more frequent.
Log files (for system lockups / game freezes / crashes)
- Dmesg log (full log): dmesg.Fri_Oct_27_07_56_33_PM_CEST_2023.log
- Ring gfx_0.0.0 dump via umr, collected for a crash with gpu_recovery=0. I can also upload a binary snapshot of this ring (copied from debugfs via cp) if it's of any use.
- amdgpu_fence_info
- lshw output
Activity
- Alex Deucher added 6000 dGPU Series label
- Owner
Does a different version of mesa help?
- Author
I've installed Mesa 23.3.0-rc1 (overwriting the system packages). It did not help; the symptoms are the same. The hang happened while scrolling in a Firefox window.
I can collect more information and apply some patches for this if necessary.
I'm having the same issue: a ring gfx error and then a soft reset. After the first reset I usually get a hard reset and need to reboot completely. The issue primarily happens when using Ardour and LibreOffice presenter at the same time, but it also happens when using only Ardour. I have tried both Xorg and Wayland and get the same result. I have also added the kernel parameter amdgpu.mcbp=0, but apparently this only helps people experiencing this issue on GX9 series cards.
Hardware:
- Laptop: Lenovo ThinkPad Z16
- CPU: Ryzen 6850H (has onboard GPU Radeon 680M)
- GPU: Radeon RX 6500M
- System Memory: 16 GB
[drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx_0.0.0 timeout, signaled seq=433823, emitted seq=433825
fedora kernel: [drm:amdgpu_job_timedout [amdgpu]] ERROR Process information: process Xorg pid 2189 thread Xorg:cs0 pid 2319
fedora kernel: amdgpu 0000:67:00.0: amdgpu: GPU reset begin!
fedora kernel: amdgpu 0000:67:00.0: amdgpu: MODE2 reset
fedora kernel: amdgpu 0000:67:00.0: amdgpu: GPU reset succeeded, trying to resume
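For comparing hangs across reports, the sequence numbers on the timeout line can be pulled out mechanically. This is a rough Python sketch, assuming the message keeps the "signaled seq=..., emitted seq=..." wording seen in this thread. The helper name parse_timeout and the "pending" field are made up for illustration, and reading the difference as "fences emitted but not yet signaled" is an interpretation, not something the log states.

```python
import re

# Matches the amdgpu_job_timedout message format quoted in this thread.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_timeout(line: str):
    """Return ring name and fence sequence numbers, or None if no match."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    signaled = int(match.group("signaled"))
    emitted = int(match.group("emitted"))
    return {
        "ring": match.group("ring"),
        "signaled": signaled,
        "emitted": emitted,
        # Hypothetical derived value: fences emitted but not yet signaled.
        "pending": emitted - signaled,
    }


if __name__ == "__main__":
    line = (
        "[drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx_0.0.0 timeout, "
        "signaled seq=433823, emitted seq=433825"
    )
    print(parse_timeout(line))
```

Lines truncated by the journal pager (ending in ">") will not match and return None, so the full line is needed.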
I'm also having this issue. It happens in a game, and, weirdly enough, during cutscenes. Image freezes and the system is completely frozen and unresponsive, even though the sound of the cutscenes keeps going. A hardware reboot is necessary.
FWIW, the game is working on Steam Deck at the moment: I am able to pick up where I left off, get past the cutscene, save, and go back to playing on my desktop. The hang happens in every cutscene in this game (Resident Evil 4: Separate Ways). In case it is relevant, this is the system information from the current SteamOS:
- OS version: 3.4.11
- OS build: 20231005.1
- Kernel version: 5.13-valve37-1-neptune
Hardware description:
- CPU: AMD Ryzen 3700X
- GPU: 0a:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] [1002:73ff] (rev c1)
- System Memory: 2 x 16GB (F4-3600C16-16GTZNC)
- Display(s): 2 x LG 27GL850B
- Type of Display Connection: DP
System information:
- Distro name and Version: Debian testing
- Kernel version: Linux desktop 6.5.0-3-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.5.8-1 (2023-10-22) x86_64 GNU/Linux
- Custom kernel: N/A
- AMD official driver version: N/A
Relevant part of
dmesg
:Oct 30 00:04:04 desktop kernel: umip: re4.exe[6504] ip:1560024a5 sp:418738: SGDT instruction cannot be used by application> Oct 30 00:04:04 desktop kernel: umip: re4.exe[6504] ip:1560024a5 sp:418738: For now, expensive software emulation returns > Oct 30 00:04:04 desktop kernel: umip: re4.exe[6504] ip:14ea19e17 sp:41c9e8: SGDT instruction cannot be used by application> Oct 30 00:04:04 desktop kernel: umip: re4.exe[6504] ip:14ea19e17 sp:41c9e8: For now, expensive software emulation returns > Oct 30 00:04:04 desktop kernel: umip: re4.exe[6504] ip:1545b7f36 sp:41d6f8: SGDT instruction cannot be used by application> Oct 30 00:05:08 desktop kernel: gmc_v10_0_process_interrupt: 38 callbacks suppressed Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x000083b93a93b000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401031 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x1 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x3 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000000060000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401031 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Oct 30 
00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x1 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x3 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000000000000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000000000000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 
00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x000000005001c000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000050000000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000000020000 from client > Oct 30 00:05:08 
desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000000010000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000000014000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: 
WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000e39d38d39000 from client > Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:08 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: gmc_v10_0_process_interrupt: 122 callbacks suppressed Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010374000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401031 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x1 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x3 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: 
amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010378000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x000000001037c000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010380000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: 
GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x000000001036c000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010370000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop 
kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010388000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x000000001038c000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 
pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010390000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32777, for> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: in page starting at address 0x0000000010394000 from client > Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MORE_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: WALKER_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: PERMISSION_FAULTS: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MAPPING_ERROR: 0x0 Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RW: 0x0 Oct 30 00:05:18 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=53091, emi> Oct 30 00:05:18 desktop kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process re4.exe pid 6504 t> Oct 30 00:05:18 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset begin! 
Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: MODE1 reset Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GPU mode1 reset Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GPU smu mode1 reset Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset succeeded, trying to resume Oct 30 00:05:19 desktop kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000900000). Oct 30 00:05:19 desktop kernel: [drm] VRAM is lost due to GPU reset! Oct 30 00:05:19 desktop kernel: [drm] PSP is resuming... Oct 30 00:05:19 desktop kernel: [drm] reserve 0xa00000 from 0x81fd000000 for PSP TMR Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: RAS: optional ras ta ucode is not available Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: SMU is resuming... Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x000> Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: SMU driver if version not matched Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: use vbios provided pptable Oct 30 00:05:19 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: SMU is resumed successfully! Oct 30 00:05:19 desktop kernel: [drm] DMUB hardware initialized: version=0x02020017 Oct 30 00:05:20 desktop kernel: [drm] kiq ring mec 2 pipe 1 q 0 Oct 30 00:05:20 desktop kernel: [drm] VCN decode and encode initialized successfully(under DPG Mode). Oct 30 00:05:20 desktop kernel: [drm] JPEG decode initialized successfully. 
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: recover vram bo from shadow start
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: recover vram bo from shadow done
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: amdgpu 0000:0a:00.0: amdgpu: GPU reset(2) succeeded!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
Oct 30 00:05:20 desktop kernel: [drm] Skip scheduling IBs!
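For anyone reading these logs: the fields printed under each page fault (MORE_FAULTS, WALKER_ERROR, PERMISSION_FAULTS, MAPPING_ERROR, RW and the client ID) appear to be bit fields of the GCVM_L2_PROTECTION_FAULT_STATUS value. Below is a minimal Python sketch of that decoding. The bit positions are an assumption based on the GC 10.x register layout, not something stated in this thread, and `decode_fault_status` is a hypothetical helper name. The layout does reproduce the values the kernel printed for 0x00101031 in the first post.

```python
# Assumed bit layout of GCVM_L2_PROTECTION_FAULT_STATUS on GC 10.x.
# Each entry is (shift, width). These positions are an assumption and
# should be checked against the kernel's register headers.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # the "Faulty UTCL2 client ID" in the log
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw status word into the named fields (hypothetical helper)."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# Status value taken from the first post in this thread.
decoded = decode_fault_status(0x00101031)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

A status of 0x00000000, as in the log above, decodes to all-zero fields, which would explain why those entries show client ID CB/DB (0x0) with no permission bits set.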
Edited by Julien-Benjamin
Hello
I've had similar issues with this GPU since I got it more than a year ago. Usually it's a hard lock that requires a complete system reboot.
I'm including my logs because I got a stack trace too.
TL;DR: after a RAM change, crashes are less frequent.
I'm in the process of swapping each part and running the system for weeks at a time to find out whether it's 100% the GPU.
My last change was the RAM, even though my 2 sticks are only 1 year old and passed Memtest86 multiple times.
Honestly, it could just be luck, but as you can see in the inxi output, I had 3 days of uptime before a crash, and it wasn't a hard one, which is a first for me.
I have the exact same setup as OP.
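To compare crash frequency between part swaps with something better than memory, the relevant kernel messages can be tallied per log. This is only a rough Python sketch: the message patterns are copied from the logs in this thread, `summarize` is a hypothetical helper, and the sample lines come from my dmesg output.

```python
import re
from collections import Counter

# Patterns taken from the amdgpu messages quoted in this thread.
PATTERNS = {
    "page_fault": re.compile(r"\[gfxhub\] page fault"),
    "ring_timeout": re.compile(r"ring \S+ timeout"),
    "soft_recovered": re.compile(r"timeout, but soft recovered"),
    "gpu_reset": re.compile(r"GPU reset begin!"),
}


def summarize(log_text):
    """Count how many log lines match each pattern (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Four lines from the dmesg output in this post.
sample = """\
[276904.946623] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236)
[276904.950741] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[276915.196169] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=140360771, emitted seq=140360774
[276915.196656] amdgpu 0000:08:00.0: amdgpu: GPU reset begin!
"""

counts = summarize(sample)
print(dict(counts))
```

Running the same function over the saved kernel log of each test period (before and after the RAM change, for example) would give numbers to compare instead of an impression.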
Inxi
System:
  Kernel: 6.2.0-36-generic arch: x86_64 bits: 64
  Distro: Ubuntu 23.04 (Lunar Lobster)
Machine:
  Type: Desktop System: ASUS product: N/A v: N/A serial: N/A
  Mobo: ASUSTeK model: ROG STRIX B550-A GAMING v: Rev X.0x serial: CENSORED
  UEFI: American Megatrends v: 3404 date: 10/07/2023
CPU:
  Info: 8-core AMD Ryzen 7 5800X [MT MCP] speed (MHz): avg: 2400 min/max: 2200/4850
Graphics:
  Device-1: AMD Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M/6850M XT] driver: amdgpu v: kernel
  Display: x11 server: X.org v: 1.21.1.7 driver: X: loaded: amdgpu unloaded: fbdev,modesetting,radeon,vesa dri: radeonsi gpu: amdgpu tty: 190x36 resolution: 1: 1920x1080 2: 1920x1080
  API: OpenGL Message: GL data unavailable in console for root.
Network:
  Device-1: Intel Ethernet I225-V driver: igc
Drives:
  Local Storage: total: 3.6 TiB used: 488.96 GiB (13.3%)
Info:
  Uptime: 3d 5h 2m Memory: 31.25 GiB used: 3.59 GiB (11.5%) Init: systemd target: graphical (5) Shell: Zsh inxi: 3.3.25
dmesg
[276894.751969] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.751975] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x000080015320a000 from client 0x1b (UTCL2) [276894.751978] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276894.751980] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) [276894.751982] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 [276894.751984] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.751985] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276894.751987] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.751988] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.751993] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.751996] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800153214000 from client 0x1b (UTCL2) [276894.751998] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276894.751999] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) [276894.752001] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 [276894.752002] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752004] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276894.752005] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752007] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752012] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752015] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800153215000 from client 0x1b (UTCL2) [276894.752018] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276894.752020] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) [276894.752022] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 
[276894.752024] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752026] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276894.752028] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752029] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752034] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752037] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x000080015320b000 from client 0x1b (UTCL2) [276894.752040] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752042] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752044] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752046] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752048] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752050] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752052] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752057] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752060] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994e1a000 from client 0x1b (UTCL2) [276894.752063] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752065] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752067] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752069] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752071] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752073] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752075] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752079] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752083] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994e0a000 
from client 0x1b (UTCL2) [276894.752085] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752087] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752089] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752091] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752093] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752095] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752097] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752102] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752105] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994e0b000 from client 0x1b (UTCL2) [276894.752107] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752109] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752111] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752113] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752115] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752117] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752119] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752124] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752127] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994e0c000 from client 0x1b (UTCL2) [276894.752129] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752131] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752133] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752135] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752137] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752139] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752141] amdgpu 0000:08:00.0: amdgpu: 
RW: 0x0 [276894.752146] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752149] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994e0d000 from client 0x1b (UTCL2) [276894.752151] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752153] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752155] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752157] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752159] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752161] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752163] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276894.752168] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276894.752171] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994e0e000 from client 0x1b (UTCL2) [276894.752173] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276894.752175] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276894.752178] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276894.752179] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276894.752181] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276894.752183] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276894.752185] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946431] gmc_v10_0_process_interrupt: 62 callbacks suppressed [276904.946435] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946442] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994edf000 from client 0x1b (UTCL2) [276904.946446] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276904.946448] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 
client ID: TCP (0x8) [276904.946451] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 [276904.946453] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946455] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276904.946457] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946459] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946464] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946468] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed1000 from client 0x1b (UTCL2) [276904.946471] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276904.946473] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) [276904.946475] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 [276904.946477] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946479] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276904.946481] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946483] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946488] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946491] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed0000 from client 0x1b (UTCL2) [276904.946494] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276904.946496] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) [276904.946498] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 [276904.946500] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946502] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276904.946504] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946506] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946511] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946514] 
amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed3000 from client 0x1b (UTCL2) [276904.946516] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00101031 [276904.946518] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) [276904.946520] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1 [276904.946522] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946524] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3 [276904.946526] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946528] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946533] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946536] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ee6000 from client 0x1b (UTCL2) [276904.946539] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276904.946541] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276904.946543] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276904.946545] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946547] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276904.946549] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946551] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946556] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946559] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed6000 from client 0x1b (UTCL2) [276904.946561] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276904.946563] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276904.946565] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276904.946567] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946569] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276904.946571] amdgpu 0000:08:00.0: 
amdgpu: MAPPING_ERROR: 0x0 [276904.946573] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946578] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946581] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed2000 from client 0x1b (UTCL2) [276904.946584] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276904.946586] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276904.946588] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276904.946590] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946592] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276904.946594] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946596] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946601] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946604] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed5000 from client 0x1b (UTCL2) [276904.946606] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276904.946608] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276904.946610] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276904.946612] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946614] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276904.946616] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946618] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946623] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946626] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ee7000 from client 0x1b (UTCL2) [276904.946628] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276904.946630] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 
client ID: CB/DB (0x0) [276904.946632] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276904.946634] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946636] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276904.946638] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946640] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.946645] amdgpu 0000:08:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:1 pasid:32769, for process Xorg pid 1919 thread Xorg:cs0 pid 2236) [276904.946648] amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x0000800994ed8000 from client 0x1b (UTCL2) [276904.946651] amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000 [276904.946653] amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: CB/DB (0x0) [276904.946655] amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x0 [276904.946657] amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0 [276904.946658] amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x0 [276904.946660] amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0 [276904.946662] amdgpu 0000:08:00.0: amdgpu: RW: 0x0 [276904.950741] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered [276915.196169] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=140360771, emitted seq=140360774 [276915.196429] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process Xorg pid 1919 thread Xorg:cs0 pid 2236 [276915.196656] amdgpu 0000:08:00.0: amdgpu: GPU reset begin! [276915.480991] amdgpu 0000:08:00.0: amdgpu: MODE1 reset [276915.480996] amdgpu 0000:08:00.0: amdgpu: GPU mode1 reset [276915.481063] amdgpu 0000:08:00.0: amdgpu: GPU smu mode1 reset [276915.986276] amdgpu 0000:08:00.0: amdgpu: GPU reset succeeded, trying to resume [276915.986459] [drm] PCIE GART of 512M enabled (table at 0x00000082FEB00000). [276915.986529] [drm] VRAM is lost due to GPU reset! [276915.986531] [drm] PSP is resuming... 
[276916.062379] [drm] reserve 0xa00000 from 0x82fd000000 for PSP TMR [276916.162375] amdgpu 0000:08:00.0: amdgpu: RAS: optional ras ta ucode is not available [276916.176177] amdgpu 0000:08:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available [276916.176181] amdgpu 0000:08:00.0: amdgpu: SMU is resuming... [276916.176186] amdgpu 0000:08:00.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw program = 0, version = 0x00413900 (65.57.0) [276916.176191] amdgpu 0000:08:00.0: amdgpu: SMU driver if version not matched [276916.176223] amdgpu 0000:08:00.0: amdgpu: use vbios provided pptable [276916.235478] amdgpu 0000:08:00.0: amdgpu: SMU is resumed successfully! [276916.236943] [drm] DMUB hardware initialized: version=0x02020017 [276916.526094] [drm] kiq ring mec 2 pipe 1 q 0 [276916.529532] [drm] VCN decode and encode initialized successfully(under DPG Mode). [276916.529876] [drm] JPEG decode initialized successfully. [276916.529889] amdgpu 0000:08:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0 [276916.529891] amdgpu 0000:08:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0 [276916.529893] amdgpu 0000:08:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0 [276916.529894] amdgpu 0000:08:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0 [276916.529895] amdgpu 0000:08:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0 [276916.529897] amdgpu 0000:08:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0 [276916.529898] amdgpu 0000:08:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0 [276916.529899] amdgpu 0000:08:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0 [276916.529901] amdgpu 0000:08:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0 [276916.529902] amdgpu 0000:08:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0 [276916.529903] amdgpu 0000:08:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0 [276916.529905] amdgpu 0000:08:00.0: amdgpu: ring sdma1 uses 
VM inv eng 13 on hub 0 [276916.529906] amdgpu 0000:08:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1 [276916.529907] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1 [276916.529909] amdgpu 0000:08:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1 [276916.529910] amdgpu 0000:08:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1 [276916.532629] amdgpu 0000:08:00.0: amdgpu: recover vram bo from shadow start [276916.552495] amdgpu 0000:08:00.0: amdgpu: recover vram bo from shadow done [276916.552516] [drm] Skip scheduling IBs! [276916.552531] amdgpu 0000:08:00.0: amdgpu: GPU reset(5) succeeded! [276916.552535] [drm] Skip scheduling IBs! [276916.552541] [drm] Skip scheduling IBs! [276916.552544] [drm] Skip scheduling IBs! [276916.552548] [drm] Skip scheduling IBs! [276916.552551] [drm] Skip scheduling IBs! [276916.552554] [drm] Skip scheduling IBs! [276916.552558] [drm] Skip scheduling IBs! [276916.552560] [drm] Skip scheduling IBs! [276916.552567] [drm] Skip scheduling IBs! [276916.552572] [drm] Skip scheduling IBs! [276916.552577] [drm] Skip scheduling IBs! [276916.572260] amdgpu_cs_ioctl: 1222 callbacks suppressed [276916.572262] [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125! [276916.677734] audit: type=1107 audit(1699947763.563:2239): pid=1582 uid=100 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_signal" bus="system" path="/org/freedesktop/login1" interface="org.freedesktop.DBus.Properties" member="PropertiesChanged" name=":1.3" mask="receive" pid=161002 label="snap.firefox.firefox" peer_pid=1620 peer_label="unconfined" exe="/usr/bin/dbus-daemon" sauid=100 hostname=? addr=? terminal=?' 
[276916.683735] audit: type=1107 audit(1699947763.571:2240): pid=1582 uid=100 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_signal" bus="system" path="/org/freedesktop/login1" interface="org.freedesktop.DBus.Properties" member="PropertiesChanged" name=":1.3" mask="receive" pid=161002 label="snap.firefox.firefox" peer_pid=1620 peer_label="unconfined" exe="/usr/bin/dbus-daemon" sauid=100 hostname=? addr=? terminal=?' [276916.689269] audit: type=1107 audit(1699947763.575:2241): pid=1582 uid=100 auid=4294967295 ses=4294967295 subj=unconfined msg='apparmor="DENIED" operation="dbus_signal" bus="system" path="/org/freedesktop/login1" interface="org.freedesktop.DBus.Properties" member="PropertiesChanged" name=":1.3" mask="receive" pid=161002 label="snap.firefox.firefox" peer_pid=1620 peer_label="unconfined" exe="/usr/bin/dbus-daemon" sauid=100 hostname=? addr=? terminal=?' [276917.222300] ------------[ cut here ]------------ [276917.222302] WARNING: CPU: 0 PID: 525180 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:600 amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.222496] Modules linked in: ch341 usbserial exfat uas usb_storage rfcomm cmac algif_hash algif_skcipher af_alg bnep nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel snd_seq_dummy snd_hrtimer xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink bridge stp llc overlay sunrpc binfmt_misc nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel btusb snd_intel_dspcfg btrtl snd_usb_audio snd_intel_sdw_acpi snd_hda_codec btbcm snd_usbmidi_lib snd_hda_core intel_rapl_msr btintel mc snd_hwdep intel_rapl_common btmtk input_leds joydev snd_pcm bluetooth edac_mce_amd ecdh_generic 
snd_seq_midi ecc snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device kvm snd_timer irqbypass snd eeepc_wmi [276917.222575] soundcore rapl wmi_bmof ccp k10temp mac_hid msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 amdgpu iommu_v2 drm_buddy gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_display_helper cec rc_core drm_kms_helper syscopyarea sysfillrect sysimgblt mfd_aaeon asus_wmi hid_generic ledtrig_audio drm crct10dif_pclmul sparse_keymap crc32_pclmul usbhid hid polyval_clmulni polyval_generic ghash_clmulni_intel platform_profile video sha512_ssse3 aesni_intel crypto_simd cryptd igc ahci i2c_piix4 xhci_pci libahci xhci_pci_renesas wmi gpio_amdpt [276917.222630] CPU: 0 PID: 525180 Comm: kworker/0:1 Not tainted 6.2.0-36-generic #37-Ubuntu [276917.222633] Hardware name: ASUS System Product Name/ROG STRIX B550-A GAMING, BIOS 3404 10/07/2023 [276917.222634] Workqueue: events drm_mode_rmfb_work_fn [drm] [276917.222660] RIP: 0010:amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.222813] Code: 31 f6 31 ff e9 72 b5 4b ee 44 89 e2 48 89 de 4c 89 f7 e8 94 fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 51 b5 4b ee <0f> 0b b8 ea ff ff ff eb c3 b8 fe ff ff ff eb bc 90 90 90 90 90 90 [276917.222815] RSP: 0018:ffffabf6a102b880 EFLAGS: 00010046 [276917.222818] RAX: 0000000000000000 RBX: ffff9109e62865b8 RCX: 0000000000000000 [276917.222819] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [276917.222821] RBP: ffffabf6a102b8a0 R08: 0000000000000000 R09: 0000000000000000 [276917.222822] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [276917.222823] R13: 0000000000000001 R14: ffff9109e6280000 R15: ffff910b5f4c7e00 [276917.222825] FS: 0000000000000000(0000) GS:ffff9110cea00000(0000) knlGS:0000000000000000 [276917.222827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [276917.222828] CR2: 00007f6fbf4d3048 CR3: 0000000121252000 CR4: 0000000000750ef0 [276917.222830] PKRU: 55555554 [276917.222832] Call Trace: 
[276917.222833] <TASK> [276917.222836] ? show_regs+0x6d/0x80 [276917.222841] ? __warn+0x89/0x160 [276917.222845] ? amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.222993] ? report_bug+0x17e/0x1b0 [276917.222997] ? handle_bug+0x46/0x90 [276917.223001] ? exc_invalid_op+0x18/0x80 [276917.223003] ? asm_exc_invalid_op+0x1b/0x20 [276917.223009] ? amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.223153] dm_set_vblank+0x195/0x1c0 [amdgpu] [276917.223370] dm_disable_vblank+0x10/0x20 [amdgpu] [276917.223528] drm_vblank_disable_and_save+0xe2/0x120 [drm] [276917.223544] ? srso_alias_return_thunk+0x5/0x7f [276917.223548] drm_crtc_vblank_off+0xe0/0x290 [drm] [276917.223564] manage_dm_interrupts+0xa9/0xd0 [amdgpu] [276917.223722] amdgpu_dm_atomic_commit_tail+0x163/0x13e0 [amdgpu] [276917.223879] ? srso_alias_return_thunk+0x5/0x7f [276917.223882] ? dcn30_internal_validate_bw+0xea/0xf10 [amdgpu] [276917.224041] ? srso_alias_return_thunk+0x5/0x7f [276917.224044] ? __kmem_cache_alloc_node+0x19f/0x340 [276917.224047] ? dcn30_validate_bandwidth+0x7b/0x380 [amdgpu] [276917.224197] ? dcn30_validate_bandwidth+0x14d/0x380 [amdgpu] [276917.224345] ? srso_alias_return_thunk+0x5/0x7f [276917.224347] ? kfree+0x78/0x120 [276917.224350] ? srso_alias_return_thunk+0x5/0x7f [276917.224352] ? dcn30_validate_bandwidth+0x14d/0x380 [amdgpu] [276917.224499] ? __ww_mutex_lock_slowpath+0x16/0x30 [276917.224503] ? srso_alias_return_thunk+0x5/0x7f [276917.224506] ? dma_resv_iter_first_unlocked+0x66/0x80 [276917.224509] ? srso_alias_return_thunk+0x5/0x7f [276917.224511] ? dma_resv_get_fences+0x5e/0x240 [276917.224514] ? srso_alias_return_thunk+0x5/0x7f [276917.224516] ? dma_resv_get_singleton+0x42/0x150 [276917.224519] ? srso_alias_return_thunk+0x5/0x7f [276917.224521] ? wait_for_completion_timeout+0x119/0x150 [276917.224523] ? srso_alias_return_thunk+0x5/0x7f [276917.224526] ? srso_alias_return_thunk+0x5/0x7f [276917.224529] commit_tail+0xc2/0x190 [drm_kms_helper] [276917.224538] ? 
srso_alias_return_thunk+0x5/0x7f [276917.224540] ? drm_atomic_helper_swap_state+0x246/0x380 [drm_kms_helper] [276917.224548] drm_atomic_helper_commit+0x11d/0x150 [drm_kms_helper] [276917.224555] drm_atomic_commit+0x99/0xd0 [drm] [276917.224569] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [276917.224586] atomic_remove_fb+0x2fd/0x380 [drm] [276917.224605] drm_framebuffer_remove+0x6b/0x1f0 [drm] [276917.224619] drm_mode_rmfb_work_fn+0x6f/0xa0 [drm] [276917.224632] process_one_work+0x225/0x430 [276917.224636] worker_thread+0x1f6/0x3e0 [276917.224638] ? srso_alias_return_thunk+0x5/0x7f [276917.224640] ? __pfx_worker_thread+0x10/0x10 [276917.224642] kthread+0xe9/0x110 [276917.224645] ? __pfx_kthread+0x10/0x10 [276917.224648] ret_from_fork+0x2c/0x50 [276917.224652] </TASK> [276917.224653] ---[ end trace 0000000000000000 ]--- [276917.224692] ------------[ cut here ]------------ [276917.224693] WARNING: CPU: 0 PID: 525180 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:600 amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.224811] Modules linked in: ch341 usbserial exfat uas usb_storage rfcomm cmac algif_hash algif_skcipher af_alg bnep nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel snd_seq_dummy snd_hrtimer xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables libcrc32c nfnetlink bridge stp llc overlay sunrpc binfmt_misc nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel btusb snd_intel_dspcfg btrtl snd_usb_audio snd_intel_sdw_acpi snd_hda_codec btbcm snd_usbmidi_lib snd_hda_core intel_rapl_msr btintel mc snd_hwdep intel_rapl_common btmtk input_leds joydev snd_pcm bluetooth edac_mce_amd ecdh_generic snd_seq_midi ecc snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device kvm snd_timer irqbypass 
snd eeepc_wmi [276917.224866] soundcore rapl wmi_bmof ccp k10temp mac_hid msr parport_pc ppdev lp parport efi_pstore dmi_sysfs ip_tables x_tables autofs4 amdgpu iommu_v2 drm_buddy gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_display_helper cec rc_core drm_kms_helper syscopyarea sysfillrect sysimgblt mfd_aaeon asus_wmi hid_generic ledtrig_audio drm crct10dif_pclmul sparse_keymap crc32_pclmul usbhid hid polyval_clmulni polyval_generic ghash_clmulni_intel platform_profile video sha512_ssse3 aesni_intel crypto_simd cryptd igc ahci i2c_piix4 xhci_pci libahci xhci_pci_renesas wmi gpio_amdpt [276917.224903] CPU: 0 PID: 525180 Comm: kworker/0:1 Tainted: G W 6.2.0-36-generic #37-Ubuntu [276917.224905] Hardware name: ASUS System Product Name/ROG STRIX B550-A GAMING, BIOS 3404 10/07/2023 [276917.224906] Workqueue: events drm_mode_rmfb_work_fn [drm] [276917.224922] RIP: 0010:amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.225033] Code: 31 f6 31 ff e9 72 b5 4b ee 44 89 e2 48 89 de 4c 89 f7 e8 94 fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 51 b5 4b ee <0f> 0b b8 ea ff ff ff eb c3 b8 fe ff ff ff eb bc 90 90 90 90 90 90 [276917.225035] RSP: 0018:ffffabf6a102b880 EFLAGS: 00010046 [276917.225036] RAX: 0000000000000000 RBX: ffff9109e62865b8 RCX: 0000000000000000 [276917.225038] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [276917.225039] RBP: ffffabf6a102b8a0 R08: 0000000000000000 R09: 0000000000000000 [276917.225040] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [276917.225041] R13: 0000000000000001 R14: ffff9109e6280000 R15: ffff910b5f4c7800 [276917.225042] FS: 0000000000000000(0000) GS:ffff9110cea00000(0000) knlGS:0000000000000000 [276917.225043] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [276917.225044] CR2: 00007f6fbf4d3048 CR3: 0000000121252000 CR4: 0000000000750ef0 [276917.225046] PKRU: 55555554 [276917.225046] Call Trace: [276917.225047] <TASK> [276917.225048] ? show_regs+0x6d/0x80 [276917.225051] ? 
__warn+0x89/0x160 [276917.225054] ? amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.225163] ? report_bug+0x17e/0x1b0 [276917.225166] ? handle_bug+0x46/0x90 [276917.225168] ? exc_invalid_op+0x18/0x80 [276917.225170] ? asm_exc_invalid_op+0x1b/0x20 [276917.225174] ? amdgpu_irq_put+0x9f/0xb0 [amdgpu] [276917.225282] dm_set_vblank+0x195/0x1c0 [amdgpu] [276917.225443] dm_disable_vblank+0x10/0x20 [amdgpu] [276917.225600] drm_vblank_disable_and_save+0xe2/0x120 [drm] [276917.225616] ? srso_alias_return_thunk+0x5/0x7f [276917.225619] drm_crtc_vblank_off+0xe0/0x290 [drm] [276917.225635] manage_dm_interrupts+0xa9/0xd0 [amdgpu] [276917.225793] amdgpu_dm_atomic_commit_tail+0x163/0x13e0 [amdgpu] [276917.225948] ? srso_alias_return_thunk+0x5/0x7f [276917.225951] ? dcn30_internal_validate_bw+0xea/0xf10 [amdgpu] [276917.226107] ? srso_alias_return_thunk+0x5/0x7f [276917.226109] ? __kmem_cache_alloc_node+0x19f/0x340 [276917.226112] ? dcn30_validate_bandwidth+0x7b/0x380 [amdgpu] [276917.226264] ? dcn30_validate_bandwidth+0x14d/0x380 [amdgpu] [276917.226414] ? srso_alias_return_thunk+0x5/0x7f [276917.226416] ? kfree+0x78/0x120 [276917.226418] ? srso_alias_return_thunk+0x5/0x7f [276917.226420] ? dcn30_validate_bandwidth+0x14d/0x380 [amdgpu] [276917.226567] ? __ww_mutex_lock_slowpath+0x16/0x30 [276917.226571] ? srso_alias_return_thunk+0x5/0x7f [276917.226573] ? dma_resv_iter_first_unlocked+0x66/0x80 [276917.226575] ? srso_alias_return_thunk+0x5/0x7f [276917.226577] ? dma_resv_get_fences+0x5e/0x240 [276917.226580] ? srso_alias_return_thunk+0x5/0x7f [276917.226582] ? dma_resv_get_singleton+0x42/0x150 [276917.226585] ? srso_alias_return_thunk+0x5/0x7f [276917.226587] ? wait_for_completion_timeout+0x119/0x150 [276917.226589] ? srso_alias_return_thunk+0x5/0x7f [276917.226592] ? srso_alias_return_thunk+0x5/0x7f [276917.226595] commit_tail+0xc2/0x190 [drm_kms_helper] [276917.226603] ? srso_alias_return_thunk+0x5/0x7f [276917.226605] ? 
drm_atomic_helper_swap_state+0x246/0x380 [drm_kms_helper] [276917.226613] drm_atomic_helper_commit+0x11d/0x150 [drm_kms_helper] [276917.226620] drm_atomic_commit+0x99/0xd0 [drm] [276917.226634] ? __pfx___drm_printfn_info+0x10/0x10 [drm] [276917.226651] atomic_remove_fb+0x2fd/0x380 [drm] [276917.226670] drm_framebuffer_remove+0x6b/0x1f0 [drm] [276917.226684] drm_mode_rmfb_work_fn+0x6f/0xa0 [drm] [276917.226697] process_one_work+0x225/0x430 [276917.226700] worker_thread+0x1f6/0x3e0 [276917.226702] ? srso_alias_return_thunk+0x5/0x7f [276917.226705] ? __pfx_worker_thread+0x10/0x10 [276917.226707] kthread+0xe9/0x110 [276917.226709] ? __pfx_kthread+0x10/0x10 [276917.226712] ret_from_fork+0x2c/0x50 [276917.226716] </TASK> [276917.226717] ---[ end trace 0000000000000000 ]---
How has your RAM testing been?
Edited by Jacob Mungle
- Alexander Koskovich mentioned in issue #2997 (closed)
- Bohdan Trach mentioned in issue mesa/mesa#10216 (closed)
I have this issue on a 6600 XT and a Ryzen 5600X. Setting amdgpu.mcbp=0 seemed to fix it for me at first, but after a few days the crash occurred again. This issue is nearly impossible to reproduce on demand, as it seems to happen completely at random.
journalctl
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: in page starting at address 0x00000000fd812000 from client 0x1b (UTCL2)
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00201031
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: MORE_FAULTS: 0x1
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: WALKER_ERROR: 0x0
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: PERMISSION_FAULTS: 0x3
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: MAPPING_ERROR: 0x0
Dec 08 13:45:59 arches kernel: amdgpu 0000:08:00.0: amdgpu: RW: 0x0
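Since amdgpu.mcbp=0 is a kernel command-line option, it may be worth confirming that the running kernel was actually booted with it before drawing conclusions. A minimal sketch in Python, assuming the option was passed on the kernel command line (the helper names are made up for illustration):

```python
import re


def mcbp_from_cmdline(cmdline: str):
    """Return the integer value of amdgpu.mcbp found on a kernel command line,
    or None if the option is not present."""
    match = re.search(r"(?:^|\s)amdgpu\.mcbp=(-?\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


def running_mcbp():
    """Read the command line of the running kernel (Linux only) and parse it."""
    with open("/proc/cmdline") as f:
        return mcbp_from_cmdline(f.read())
```

Calling running_mcbp() on the affected machine should return 0 if the option took effect at boot, and None if it was never passed.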
Report details
- Date generated: 2023-12-08 13:59:48
Hardware Information:
- Hardware Model: ASUSTeK COMPUTER INC. TUF B450M-PLUS GAMING
- Memory: 16.0 GiB
- Processor: AMD Ryzen™ 5 5600X × 12
- Graphics: AMD Radeon™ RX 6600 XT
- Disk Capacity: 1.5 TB
Software Information:
- Firmware Version: 3802
- OS Name: Arch Linux
- OS Type: 64-bit
- GNOME Version: 45.2
- Windowing System: Wayland
- Kernel Version: Linux 6.6.4-arch1-1
Edited by Xylight
- Ghost User mentioned in issue #3032
I am facing the same hangs: usually the GPU resets after a few minutes to a few hours of playing a game. Trying to reproduce it led me to a 100% reliable reproducer, running furmark with:
furmark --benchmark --demo furmark-knot-vk --p1440
Or with 1080 in place of 1440. This usually crashes the GPU within 30 seconds.
Software Information
- OS Name: Arch Linux
- Kernel: 6.8.5-arch1-1
- Mesa: 24.0.5-1
I did not have hangs before, but now they are happening more and more often. To verify that it is not a hardware issue, I booted into Windows and ran the same furmark benchmarks; they all passed without hangs.
Dmesg
[ 327.641163] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=20333, emitted seq=20335
[ 327.641480] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process furmark pid 5354 thread furmark pid 5356
[ 327.641768] amdgpu 0000:09:00.0: amdgpu: GPU reset begin!
[ 327.846706] amdgpu 0000:09:00.0: amdgpu: MODE1 reset
[ 327.846714] amdgpu 0000:09:00.0: amdgpu: GPU mode1 reset
[ 327.846790] amdgpu 0000:09:00.0: amdgpu: GPU smu mode1 reset
[ 328.351511] amdgpu 0000:09:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 328.351855] [drm] PCIE GART of 512M enabled (table at 0x0000008001300000).
[ 328.351953] [drm] VRAM is lost due to GPU reset!
[ 328.351955] [drm] PSP is resuming...
[ 328.429147] [drm] reserve 0xa00000 from 0x82fd000000 for PSP TMR
[ 328.528931] amdgpu 0000:09:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 328.542498] amdgpu 0000:09:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 328.542500] amdgpu 0000:09:00.0: amdgpu: SMU is resuming...
[ 328.542504] amdgpu 0000:09:00.0: amdgpu: smu driver if version = 0x0000000e, smu fw if version = 0x00000012, smu fw program = 0, version = 0x00413e00 (65.62.0)
[ 328.542507] amdgpu 0000:09:00.0: amdgpu: SMU driver if version not matched
[ 328.542541] amdgpu 0000:09:00.0: amdgpu: use vbios provided pptable
[ 328.599477] amdgpu 0000:09:00.0: amdgpu: SMU is resumed successfully!
[ 328.600760] [drm] DMUB hardware initialized: version=0x02020020
[ 328.722638] [drm] kiq ring mec 2 pipe 1 q 0
[ 328.725495] [drm] VCN decode and encode initialized successfully(under DPG Mode).
[ 328.725832] [drm] JPEG decode initialized successfully.
[ 328.725849] amdgpu 0000:09:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 328.725850] amdgpu 0000:09:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 328.725852] amdgpu 0000:09:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 328.725853] amdgpu 0000:09:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 328.725854] amdgpu 0000:09:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 328.725855] amdgpu 0000:09:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 328.725856] amdgpu 0000:09:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 328.725857] amdgpu 0000:09:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 328.725859] amdgpu 0000:09:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 328.725860] amdgpu 0000:09:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 11 on hub 0
[ 328.725861] amdgpu 0000:09:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 328.725862] amdgpu 0000:09:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[ 328.725863] amdgpu 0000:09:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
[ 328.725865] amdgpu 0000:09:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
[ 328.725866] amdgpu 0000:09:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
[ 328.725867] amdgpu 0000:09:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
[ 328.728648] amdgpu 0000:09:00.0: amdgpu: recover vram bo from shadow start
[ 328.732615] amdgpu 0000:09:00.0: amdgpu: recover vram bo from shadow done
[ 328.732629] amdgpu 0000:09:00.0: amdgpu: GPU reset(2) succeeded!
[ 328.732630] [drm] Skip scheduling IBs!
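For anyone comparing many such logs, the ring timeout line can be pulled out mechanically. A rough sketch in Python, assuming the message format stays exactly as in the dmesg output above (the function name and the "pending" field are my own, not something the driver reports):

```python
import re

# Matches the "ring <name> timeout, signaled seq=N, emitted seq=M" message.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line: str):
    """Extract ring name and fence sequence numbers from one dmesg line.

    Returns None if the line is not a ring timeout message.
    """
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    signaled = int(m["signaled"])
    emitted = int(m["emitted"])
    return {
        "ring": m["ring"],
        "signaled": signaled,
        "emitted": emitted,
        # Number of emitted fences that had not signaled yet.
        "pending": emitted - signaled,
    }
```

Feeding it the first line of the log above gives ring gfx_0.0.0 with two fences emitted but not yet signaled at the time of the timeout.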
Tried the Arch Linux LTS kernel, which is 6.6; it also crashes. Tried furmark on Ubuntu 23.10 with Linux 6.5 and Mesa 23.2. The first few tries went OK, and then it consistently crashed again.
In Windows, I noticed the memory clock peaks at 1990 MHz and the core clock peaks at 2545 MHz. Details from AMD's tool on Windows: https://i.imgur.com/jdG3PdK.png
Compared to the values in CoreCtrl:
- Max power: 186 Watt (Windows peaked at 166 Watt)
- GPU: 500-2589 MHz (Windows peaks at 2545 MHz)
- Memory Max: 1000 MHz (compared to 1990 MHz on Windows)
I could of course bump the max in Linux, but I am not sure whether that can cause hardware damage.
Furmark (vk) is stable at 720p and maxes out at 975 mV, 185 Watt, 1000 MHz memory and around 2318 MHz core (it varies).
Edited by Jelle van der Waa
So it turns out that in Linux I apparently have to multiply the VRAM frequency by 2, so the 1000 MHz maximum corresponds to 2000 MHz, which is 10 MHz higher than the 1990 MHz peak in Windows.
So I have set the max to 950 MHz on Linux with CoreCtrl, and so far (knock on wood) it runs without crashes.
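To make the comparison concrete, here is the arithmetic as a small Python sketch. The factor of 2 is only the observation from the comment above, not something confirmed against driver documentation, and the function name is hypothetical:

```python
# Peak memory clock reported by AMD's tool on Windows (from this thread).
WINDOWS_PEAK_MHZ = 1990


def effective_mem_clock_mhz(linux_reported_mhz: int, multiplier: int = 2) -> int:
    """Convert the memory clock shown on Linux to the scale used on Windows.

    The multiplier of 2 is an assumption taken from the comment above.
    """
    return linux_reported_mhz * multiplier


for reported in (1000, 950):
    effective = effective_mem_clock_mhz(reported)
    delta = effective - WINDOWS_PEAK_MHZ
    print(f"{reported} MHz on Linux -> {effective} MHz effective ({delta:+d} MHz vs Windows peak)")
```

Under that assumption, the default 1000 MHz limit sits slightly above the Windows peak, while the 950 MHz limit sits below it.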
You might actually be my savior. I just tested it and can't get it to crash with Furmark, but the system starts behaving very strangely, as if it's about to freeze and crash. Turns out it set the GPU frequency of my 6800 XT to over 2400 MHz, which is quite a bit higher than the manufacturer-specified boost clock. But it only does this in certain situations. The settings in CoreCtrl seem very strange, but it is not a CoreCtrl issue, since I had this happening before I installed it. Back then I didn't check whether the frequency was always right, since it's impossible to monitor in fullscreen. But I can't manually set it to the correct values. Or rather, I can, but they get ignored and much lower values are used instead. The profiles also seem bugged. Energy saving has quite the opposite effect. 3D fullscreen has a frequency that is far too low, but VR seems fine. I will play like this for a while and see if it's stable now. And I was always wondering why my edge temperature was so high when I have water cooling...
I am having the same issue. I tested furmark for ~5 min at 1080p, but it did not crash in that timeframe. Generally, it happens randomly, even during non-GPU-demanding workloads. My system stats are very similar to #2943 (comment 2199629), but for me it says 0x00101031 instead of 0x00201031 after GCVM_L2_PROTECTION_FAULT_STATUS:.
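To see what actually differs between the two status values, the register can be split into bit fields. The sketch below is in Python, and the bit layout is an assumption on my part: the positions for MORE_FAULTS, WALKER_ERROR, PERMISSION_FAULTS, MAPPING_ERROR, CID and RW reproduce the decoded values the driver prints in the logs in this thread, while the VMID position is taken from my recollection of the gfx10 register headers and should be checked against the kernel source.

```python
# name -> (shift, width in bits); layout is an assumption, see the note above.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),    # UTCL2 client ID, 0x8 is printed as TCP in the logs
    "RW": (18, 1),
    "VMID": (20, 4),  # assumed position
}


def decode_fault_status(status: int) -> dict:
    """Split a GCVM_L2_PROTECTION_FAULT_STATUS value into named fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


for value in (0x00101031, 0x00201031):
    print(f"{value:#010x}", decode_fault_status(value))
```

If the layout holds, the two values describe the same kind of fault (same client, same permission bits) and differ only in the VMID field, 1 versus 2, which would match the vmid:1 shown in the original report.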