[RADV] GPU hangs in Judgment (2058180)
Description
In Judgment (2058180), the GPU is very likely to hang whenever the camera points up toward the sky. Sometimes it recovers; sometimes it doesn't and I have to force a reboot. Either way, the game is unstable.
This happens at 4K resolution on my 6900 XT system (with the medium or very high preset). It reproduces less often at 1080p, and it doesn't reproduce on the Steam Deck (I couldn't try 4K there as I don't have a dock).
It is completely reproducible on my end. I have to do either of the following:
- play the Baseball mini-game and hit a home run (in some sessions baseball runs fine; I couldn't figure out why)
- play a bit too long during the investigation phases, where you are in first-person view and have to capture evidence by zooming into the environment; pointing the camera upwards seems to trigger it more often than not
I managed to reproduce this on both mesa stable (22.3.4-1) and the latest mesa-git (mesa-tkg-git), as well as on my up-to-date Fedora laptop, which also has an RDNA 2 GPU (6800U).
UPDATE
It seems to be kernel-related: the only way I've managed to avoid this bug is by installing linux-neptune on my desktop system. That could also explain why the Steam Deck doesn't have this issue (on both linux-neptune and linux-neptune-61).
Screenshots/video files
Baseball minigame hang | Investigation mode hang |
---|---|
video | |
Log files (for system lockups / game freezes / crashes)
- Output of dmesg:
[sam. févr. 11 18:19:57 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:20:07 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:20:11 2023] kauditd_printk_skb: 1 callbacks suppressed
[sam. févr. 11 18:20:11 2023] audit: type=1131 audit(1676136012.811:166): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-timedated comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[sam. févr. 11 18:20:11 2023] audit: type=1334 audit(1676136012.828:167): prog-id=0 op=UNLOAD
[sam. févr. 11 18:20:11 2023] audit: type=1334 audit(1676136012.828:168): prog-id=0 op=UNLOAD
[sam. févr. 11 18:20:11 2023] audit: type=1334 audit(1676136012.828:169): prog-id=0 op=UNLOAD
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800046200000 from client 0x1b (UTCL2)
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:22:06 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:22:16 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800044600000 from client 0x1b (UTCL2)
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:22:16 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:22:26 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800051400000 from client 0x1b (UTCL2)
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:22:26 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:22:36 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x000080005c200000 from client 0x1b (UTCL2)
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:22:36 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:22:46 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x000080002c600000 from client 0x1b (UTCL2)
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:22:46 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:22:56 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800039400000 from client 0x1b (UTCL2)
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:22:56 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:23:06 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800046200000 from client 0x1b (UTCL2)
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:23:06 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:23:16 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800044600000 from client 0x1b (UTCL2)
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:23:16 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:23:26 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800052600000 from client 0x1b (UTCL2)
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:23:33 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
[sam. févr. 11 18:23:43 2023] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, but soft recovered
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:6 pasid:32780, for process timing pid 3954 thread timing pid 4000)
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: in page starting at address 0x0000800051400000 from client 0x1b (UTCL2)
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00601430
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: Faulty UTCL2 client ID: SQC (data) (0xa)
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: MORE_FAULTS: 0x0
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: WALKER_ERROR: 0x0
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: MAPPING_ERROR: 0x0
[sam. févr. 11 18:23:43 2023] amdgpu 0000:0c:00.0: amdgpu: RW: 0x0
- RADV dump (RADV_DEBUG=hang): radv_dumps.tar.gz
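For anyone triaging the dmesg excerpt above, here is a minimal Python sketch that counts the ring timeouts, groups the page faults by address, and splits the raw GCVM_L2_PROTECTION_FAULT_STATUS value into fields. The bit layout in FIELDS is an assumption: it was chosen because it reproduces the values the kernel printed next to 0x00601430 in this log (PERMISSION_FAULTS 0x3, client ID 0xa, vmid 6), not taken from a register reference. The function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed bit layout (shift, width). It matches the decoded values printed
# in the log above for status 0x00601430, but is not an official reference.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),
    "RW": (18, 1),
    "VMID": (20, 4),
}

ADDR_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+)")
TIMEOUT_RE = re.compile(r"ring (\S+) timeout")


def decode_status(status: int) -> dict:
    """Split a raw fault status value into the named bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


def summarize(dmesg_text: str) -> dict:
    """Count ring timeouts per ring and page faults per page address."""
    return {
        "timeouts": dict(Counter(TIMEOUT_RE.findall(dmesg_text))),
        "fault_pages": dict(Counter(ADDR_RE.findall(dmesg_text))),
    }


if __name__ == "__main__":
    # Status value copied from the log above.
    print(decode_status(0x00601430))
```

Feeding the saved dmesg output to summarize() should make it easier to see that the same handful of pages fault repeatedly, always from the same process and vmid.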
Steps to reproduce
Extract judgment_save_files.tar.gz into the prefix's %APPDATA%. You might need to rename the Sega/Judgment/Steam/96957176 folder to your own Steam account ID, or create a new game save yourself and copy the Sega/Judgment/Steam/96957176/* files over afterwards.
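The copy-and-rename step can be scripted. The following is a rough Python sketch, assuming the archive has already been extracted and that it contains Sega/Judgment/Steam/96957176 at its root. The function name and its parameters are hypothetical, and the location of %APPDATA% inside your prefix is something you have to supply yourself.

```python
import shutil
from pathlib import Path

# Account ID folder name used in the attached save archive.
ORIGINAL_ID = "96957176"


def install_saves(extracted_root: Path, appdata: Path, account_id: str) -> Path:
    """Copy the extracted save files into the folder for another account ID.

    extracted_root: directory where the archive was extracted (assumed layout).
    appdata: the prefix's %APPDATA% directory (must be provided by the caller).
    account_id: your own Steam account ID, as used in the save path.
    """
    src = extracted_root / "Sega" / "Judgment" / "Steam" / ORIGINAL_ID
    dst = appdata / "Sega" / "Judgment" / "Steam" / account_id
    dst.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets this overwrite into a save folder the game already created.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Copying rather than renaming keeps the extracted files intact, so the same archive can be reused for a second prefix.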
Set the game to run at 4K resolution, with either medium or very high preset.
To reproduce the Baseball issue (not 100% reproducible):
- Select Continue in the main menu and load save number 03
- Move the character towards the door in front of you and start the mini-game (press X or Enter)
- Select Challenge mode (the right-hand option in the selection ring) and choose the first set
- Hit a home run as shown here: https://youtu.be/E3dJF9_8ZUg?t=54
To reproduce the other issue (always reproducible on my end):
- Select Continue in the main menu and load save number 06
- Move the character forward into the small hallway (follow the red pole on the mini-map displayed in the bottom left) until you're interrupted by the investigation sequence
- Move around, zoom in and out, and look up at the sky until the hangs begin
System information
inxi -GSC -xx output:
System:
Host: nzxt-arch Kernel: 6.0.0-1-hdr-gd359b9dc0f00 arch: x86_64 bits: 64
compiler: gcc v: 12.2.0 Console: pty pts/0 DM: LightDM Distro: Arch Linux
CPU:
Info: 12-core model: AMD Ryzen 9 3900X bits: 64 type: MT MCP arch: Zen 2
rev: 0 cache: L1: 768 KiB L2: 6 MiB L3: 64 MiB
Speed (MHz): avg: 2984 high: 4241 min/max: 2200/4672 boost: enabled
cores: 1: 3830 2: 4210 3: 2075 4: 3923 5: 3576 6: 3597 7: 2195 8: 2200
9: 2200 10: 2067 11: 4072 12: 2120 13: 3615 14: 3708 15: 2121 16: 3606
17: 4241 18: 3684 19: 2197 20: 2195 21: 2196 22: 2118 23: 3782 24: 2100
bogomips: 182135
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3
svm
Graphics:
Device-1: AMD Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
vendor: Micro-Star MSI driver: amdgpu v: kernel arch: RDNA-2 pcie:
speed: 16 GT/s lanes: 16 ports: active: HDMI-A-1 off: DP-2
empty: DP-1,DP-3 bus-ID: 0c:00.0 chip-ID: 1002:73bf
Device-2: Sunplus Innovation FULL HD webcam type: USB
driver: snd-usb-audio,uvcvideo bus-ID: 5-2.4.2:6 chip-ID: 1bcf:2283
Display: wayland server: X.org v: 1.21.1.7 with: Xwayland v: 22.1.7
compositor: Gamescope driver: X: loaded: amdgpu unloaded: modesetting
alternate: fbdev,vesa dri: radeonsi gpu: amdgpu tty: 77x40
Monitor-1: DP-2 model: Dell S3220DGF res: 2560x1440 dpi: 93
diag: 806mm (31.7")
Monitor-2: HDMI-A-1 model: LG (GoldStar) TV SSCR2 res: 3840x2160
dpi: 61 diag: 1836mm (72.3")
API: EGL/GBM Message: No known Wayland EGL/GBM data sources.
If applicable
- Wine/Proton version: latest Proton Experimental