7900XTX - Firefox - "MES failed to response", "failed to unmap legacy queue" GPU hang
Brief summary of the problem:
Opening Firefox or just browsing with it leads to sporadic GPU hangs. The hang occurs regardless of hardware video decoding (tested with media.hardware-video-decoding.enabled set to both true and false). I have not seen the error messages "MES failed to response" and "failed to unmap legacy queue" reported before; please correct me if I'm wrong.
Hardware description:
- CPU: Ryzen 3700x
- GPU: RX7900XTX
- System Memory: 32GB
- Display(s): 1x 3440x1440x100Hz
- Type of Display Connection: DP
System information:
- Distro name and Version: Fedora 38
- Kernel version: 6.2.0 (latest mainline)
- Custom kernel: No (Fedora stock kernel)
- AMD official driver version: mesa-23.0.0~rc4-3.fc38.x86_64
- Firefox version: firefox-110.0-3.fc38.x86_64
How to reproduce the issue:
Firefox is the only application I've seen this specific hang with. Just browsing or opening/closing Firefox is enough to trigger it. It usually happens after 5 to 20 minutes of use.
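Because the hang is sporadic, it can help to filter the kernel log for the signatures seen in this report while reproducing. The following is a minimal sketch, assuming the log is available as text lines (for example from dmesg or journalctl -k). The function name find_hang_markers and the marker list are illustrative assumptions and not part of any existing tool.

```python
# Substrings taken from the log in this report. The list is an assumption
# and may not cover every variant of the hang.
HANG_MARKERS = (
    "amdgpu_job_timedout",
    "MES failed to response",
    "failed to unmap legacy queue",
    "GPU reset begin",
)


def find_hang_markers(lines):
    """Return (line_number, text) pairs for lines containing a known marker."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(marker in line for marker in HANG_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Small embedded sample; in practice the lines would come from the kernel log.
    sample = [
        "[drm] PSP is resuming...",
        "[drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue",
    ]
    for number, text in find_hang_markers(sample):
        print(f"{number}: {text}")
```

To use it on a real log, the lines could be read from a saved dmesg dump and passed to find_hang_markers; substring matching is used on purpose so the sketch does not depend on timestamp or prefix formats.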
Log files (for system lockups / game freezes / crashes)
(removed repeating lines)
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=4132, emitted seq=4134
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process firefox pid 3320 thread firefox:cs0 pid 3399
amdgpu 0000:0a:00.0: amdgpu: GPU reset begin!
amdgpu 0000:0a:00.0: amdgpu: IP block:gfx_v11_0 is hung!
amdgpu 0000:0a:00.0: amdgpu: soft reset failed, will fallback to full reset!
[drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
amdgpu 0000:0a:00.0: amdgpu: MODE1 reset
amdgpu 0000:0a:00.0: amdgpu: GPU mode1 reset
amdgpu 0000:0a:00.0: amdgpu: GPU smu mode1 reset
amdgpu 0000:0a:00.0: amdgpu: GPU reset succeeded, trying to resume
[drm] PCIE GART of 512M enabled (table at 0x00000085FEB00000).
[drm] VRAM is lost due to GPU reset!
[drm] PSP is resuming...
[drm] reserve 0x1300000 from 0x85fc000000 for PSP TMR
amdgpu 0000:0a:00.0: amdgpu: RAP: optional rap ta ucode is not available
amdgpu 0000:0a:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
amdgpu 0000:0a:00.0: amdgpu: SMU is resuming...
amdgpu 0000:0a:00.0: amdgpu: smu driver if version = 0x00000037, smu fw if version = 0x00000034, smu fw program = 0, smu fw version = 0x004e4b00 (78.75.0)
amdgpu 0000:0a:00.0: amdgpu: SMU driver if version not matched
amdgpu 0000:0a:00.0: amdgpu: SMU is resumed successfully!
[drm] DMUB hardware initialized: version=0x07000A01
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:91
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:99
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:107
[drm] REG_WAIT timeout 1us * 1000 tries - dcn32_dsc_pg_control line:115
[drm] kiq ring mec 3 pipe 1 q 0
[drm] VCN decode and encode initialized successfully(under DPG Mode).
amdgpu 0000:0a:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
amdgpu 0000:0a:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
amdgpu 0000:0a:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 1
amdgpu 0000:0a:00.0: amdgpu: ring vcn_unified_1 uses VM inv eng 1 on hub 1
amdgpu 0000:0a:00.0: amdgpu: ring jpeg_dec uses VM inv eng 4 on hub 1
amdgpu 0000:0a:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 14 on hub 0
amdgpu 0000:0a:00.0: amdgpu: recover vram bo from shadow start
amdgpu 0000:0a:00.0: amdgpu: recover vram bo from shadow done
[drm] Skip scheduling IBs!
[drm] ring gfx_32776.1.1 was added
[drm] ring compute_32776.2.2 was added
[drm] ring compute_32776.2.3 was added
[drm] ring compute_32776.2.4 was added
[drm] ring compute_32776.2.5 was added
[drm] ring sdma_32776.3.6 was added
[drm] ring sdma_32776.3.7 was added
[drm] ring gfx_32776.1.1 test pass
[drm] ring gfx_32776.1.1 ib test pass
[drm] ring compute_32776.2.2 test pass
[drm] ring compute_32776.2.2 ib test pass
[drm] ring compute_32776.2.3 test pass
[drm] ring compute_32776.2.3 ib test pass
[drm] ring compute_32776.2.4 test pass
[drm] ring compute_32776.2.4 ib test pass
[drm] ring compute_32776.2.5 test pass
[drm] ring compute_32776.2.5 ib test pass
[drm] ring sdma_32776.3.6 test pass
[drm] ring sdma_32776.3.6 ib test pass
[drm] ring sdma_32776.3.7 test pass
[drm] ring sdma_32776.3.7 ib test pass
amdgpu 0000:0a:00.0: amdgpu: GPU reset(2) succeeded!
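Two values in the log above can be read off mechanically, which may help when comparing this report with others. The sketch below parses the ring timeout line and unpacks the SMU firmware version. The regular expression and the helper names parse_ring_timeout and decode_smu_fw_version are assumptions made for illustration. The byte layout of the version is inferred only from the log's own "0x004e4b00 (78.75.0)" pairing, and reading "emitted minus signaled" as the number of pending fences is my interpretation of the message.

```python
import re

# Pattern modelled on the timeout line in this report (an assumption, not a
# stable kernel interface).
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)


def parse_ring_timeout(line):
    """Extract the ring name and fence sequence numbers from a timeout line."""
    match = TIMEOUT_RE.search(line)
    if match is None:
        return None
    signaled = int(match["signaled"])
    emitted = int(match["emitted"])
    return {
        "ring": match["ring"],
        "signaled": signaled,
        "emitted": emitted,
        # Fences emitted but not yet signaled when the timeout fired.
        "pending": emitted - signaled,
    }


def decode_smu_fw_version(raw):
    """Unpack a 32-bit value into the three low bytes as major.minor.patch."""
    return f"{(raw >> 16) & 0xFF}.{(raw >> 8) & 0xFF}.{raw & 0xFF}"


if __name__ == "__main__":
    line = (
        "[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, "
        "signaled seq=4132, emitted seq=4134"
    )
    print(parse_ring_timeout(line))
    print(decode_smu_fw_version(0x004E4B00))
```

For the log in this report, the timeout line gives two pending fences on ring gfx_0.0.0, and 0x004e4b00 unpacks to 78.75.0, matching the version the driver prints.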