amdgpu: gnome-shell often crashes on AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics
Hi
The GNOME session on my laptop (HP EliteBook 845 G10 with AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics) gets closed about every 5 minutes during videoconferencing (Webex, BBB...), about every 30 minutes when playing videos in Chromium, and about every 2 hours otherwise, so it is very difficult to work.
I tried all kernels from 6.2 to 6.7; the least bad are 6.4.0 (mainline) and 6.5.0-generic. I also tried the firmware from https://gitlab.com/kernel-firmware/linux-firmware. I upgraded from Ubuntu 22.04 to 23.10 two days ago.
I also tried booting with the kernel parameters amdgpu.sg_display=0 and amdgpu.dcdebugmask=0x10 amdgpu.noretry=0.
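For anyone who wants to confirm that such parameters really reached the running kernel, here is a minimal Python sketch. It assumes a Linux system where /proc/cmdline holds the boot command line; the function names amdgpu_params and running_kernel_params are made up for this example and are not part of any tool.

```python
from pathlib import Path


def amdgpu_params(cmdline):
    """Return the amdgpu.* options found in a kernel command line string."""
    prefix = "amdgpu."
    params = {}
    for token in cmdline.split():
        # Only keep tokens of the form amdgpu.<name>=<value>.
        if token.startswith(prefix) and "=" in token:
            key, value = token.split("=", 1)
            params[key[len(prefix):]] = value
    return params


def running_kernel_params():
    """Read the command line of the running kernel (Linux only)."""
    return amdgpu_params(Path("/proc/cmdline").read_text())
```

Calling running_kernel_params() after a reboot should list sg_display, dcdebugmask and noretry if they were passed on the kernel command line. Options set through a modprobe configuration file would not show up here.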
Every time the session closes, these lines appear in syslog:
2024-02-08T15:41:14.275612+01:00 chambotte kernel: [22865.655355] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled seq=1333399, emitted seq=1333401
2024-02-08T15:41:14.275627+01:00 chambotte kernel: [22865.655525] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process gnome-shell pid 17734 thread gnome-shel:cs0 pid 17761
2024-02-08T15:41:14.275630+01:00 chambotte kernel: [22865.655658] amdgpu 0000:c3:00.0: amdgpu: GPU reset begin!
2024-02-08T15:41:14.591989+01:00 chambotte kernel: [22865.969212] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:14.592005+01:00 chambotte kernel: [22865.969387] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:14.720107+01:00 chambotte kernel: [22866.099425] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:14.720117+01:00 chambotte kernel: [22866.099563] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:14.851993+01:00 chambotte kernel: [22866.229553] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:14.852004+01:00 chambotte kernel: [22866.229689] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:14.979995+01:00 chambotte kernel: [22866.359771] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:14.980006+01:00 chambotte kernel: [22866.359907] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:15.111611+01:00 chambotte kernel: [22866.489879] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:15.111620+01:00 chambotte kernel: [22866.490015] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:15.239620+01:00 chambotte kernel: [22866.620001] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:15.239633+01:00 chambotte kernel: [22866.620141] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:15.371992+01:00 chambotte kernel: [22866.749907] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:15.372010+01:00 chambotte kernel: [22866.750084] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:15.500111+01:00 chambotte kernel: [22866.880134] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:15.500123+01:00 chambotte kernel: [22866.880275] [drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
2024-02-08T15:41:15.631609+01:00 chambotte kernel: [22867.010285] [drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
2024-02-08T15:41:15.839612+01:00 chambotte kernel: [22867.219872] [drm:gfx_v11_0_cp_gfx_enable.isra.0 [amdgpu]] *ERROR* failed to halt cp gfx
2024-02-08T15:41:15.843616+01:00 chambotte kernel: [22867.221367] amdgpu 0000:c3:00.0: amdgpu: MODE2 reset
2024-02-08T15:41:15.871710+01:00 chambotte kernel: [22867.252473] amdgpu 0000:c3:00.0: amdgpu: GPU reset succeeded, trying to resume
2024-02-08T15:41:15.871742+01:00 chambotte kernel: [22867.253144] [drm] PCIE GART of 512M enabled (table at 0x0000008000F00000).
2024-02-08T15:41:15.875616+01:00 chambotte kernel: [22867.253402] [drm] VRAM is lost due to GPU reset!
2024-02-08T15:41:15.875625+01:00 chambotte kernel: [22867.253404] amdgpu 0000:c3:00.0: amdgpu: SMU is resuming...
2024-02-08T15:41:15.875626+01:00 chambotte kernel: [22867.256928] amdgpu 0000:c3:00.0: amdgpu: SMU is resumed successfully!
2024-02-08T15:41:15.879610+01:00 chambotte kernel: [22867.259539] [drm] DMUB hardware initialized: version=0x08003000
2024-02-08T15:41:16.455639+01:00 chambotte kernel: [22867.836724] [drm] kiq ring mec 3 pipe 1 q 0
2024-02-08T15:41:16.460325+01:00 chambotte kernel: [22867.839180] [drm] VCN decode and encode initialized successfully(under DPG Mode).
2024-02-08T15:41:16.460335+01:00 chambotte kernel: [22867.839284] amdgpu 0000:c3:00.0: [drm:jpeg_v4_0_hw_init [amdgpu]] JPEG decode initialized successfully.
2024-02-08T15:41:16.460338+01:00 chambotte kernel: [22867.839798] amdgpu 0000:c3:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
2024-02-08T15:41:16.460339+01:00 chambotte kernel: [22867.839801] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
2024-02-08T15:41:16.460340+01:00 chambotte kernel: [22867.839803] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
2024-02-08T15:41:16.460341+01:00 chambotte kernel: [22867.839804] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
2024-02-08T15:41:16.460343+01:00 chambotte kernel: [22867.839806] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
2024-02-08T15:41:16.460344+01:00 chambotte kernel: [22867.839807] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
2024-02-08T15:41:16.460344+01:00 chambotte kernel: [22867.839808] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
2024-02-08T15:41:16.460345+01:00 chambotte kernel: [22867.839810] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
2024-02-08T15:41:16.460345+01:00 chambotte kernel: [22867.839812] amdgpu 0000:c3:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
2024-02-08T15:41:16.460346+01:00 chambotte kernel: [22867.839814] amdgpu 0000:c3:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
2024-02-08T15:41:16.460346+01:00 chambotte kernel: [22867.839815] amdgpu 0000:c3:00.0: amdgpu: ring vcn_unified_0 uses VM inv eng 0 on hub 8
2024-02-08T15:41:16.460347+01:00 chambotte kernel: [22867.839817] amdgpu 0000:c3:00.0: amdgpu: ring jpeg_dec uses VM inv eng 1 on hub 8
2024-02-08T15:41:16.460366+01:00 chambotte kernel: [22867.839818] amdgpu 0000:c3:00.0: amdgpu: ring mes_kiq_3.1.0 uses VM inv eng 13 on hub 0
2024-02-08T15:41:16.460367+01:00 chambotte kernel: [22867.841660] amdgpu 0000:c3:00.0: amdgpu: recover vram bo from shadow start
2024-02-08T15:41:16.460368+01:00 chambotte kernel: [22867.841662] amdgpu 0000:c3:00.0: amdgpu: recover vram bo from shadow done
2024-02-08T15:41:16.460368+01:00 chambotte kernel: [22867.841825] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.460378+01:00 chambotte kernel: [22867.841848] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.460379+01:00 chambotte kernel: [22867.841849] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.460380+01:00 chambotte kernel: [22867.841870] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.460381+01:00 chambotte kernel: [22867.841884] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.460385+01:00 chambotte kernel: [22867.841895] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.463580+01:00 chambotte kernel: [22867.841907] [drm] Skip scheduling IBs!
2024-02-08T15:41:16.463587+01:00 chambotte kernel: [22867.842471] [drm] ring gfx_32787.1.1 was added
2024-02-08T15:41:16.463587+01:00 chambotte kernel: [22867.842892] [drm] ring compute_32787.2.2 was added
2024-02-08T15:41:16.463588+01:00 chambotte kernel: [22867.843298] [drm] ring sdma_32787.3.3 was added
2024-02-08T15:41:16.463588+01:00 chambotte kernel: [22867.843339] [drm] ring gfx_32787.1.1 ib test pass
2024-02-08T15:41:16.463589+01:00 chambotte kernel: [22867.843367] [drm] ring compute_32787.2.2 ib test pass
2024-02-08T15:41:16.463589+01:00 chambotte kernel: [22867.843403] [drm] ring sdma_32787.3.3 ib test pass
2024-02-08T15:41:16.463589+01:00 chambotte kernel: [22867.843935] amdgpu 0000:c3:00.0: amdgpu: GPU reset(4) succeeded!
2024-02-08T15:41:16.463985+01:00 chambotte gnome-shell[17734]: amdgpu: amdgpu_cs_query_fence_status failed.
2024-02-08T15:41:16.464272+01:00 chambotte /usr/libexec/gdm-x-session[17233]: amdgpu: The CS has been rejected (-125). Recreate the context.
2024-02-08T15:41:16.464345+01:00 chambotte /usr/libexec/gdm-x-session[17233]: amdgpu: The CS has been rejected (-125), but the context isn't robust.
2024-02-08T15:41:16.464366+01:00 chambotte /usr/libexec/gdm-x-session[17233]: amdgpu: The process will be terminated.
2024-02-08T15:41:16.478688+01:00 chambotte rocketchat-desktop.desktop[18371]: [18371:0208/154116.477729:ERROR:connection.cc(46)] X connection error received.
2024-02-08T15:41:16.479154+01:00 chambotte at-spi-bus-launcher[17780]: X connection to :1 broken (explicit kill or server shutdown).
2024-02-08T15:41:16.481312+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Wacom.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.483620+01:00 chambotte gnome-shell[17734]: X connection to :1 broken (explicit kill or server shutdown).
2024-02-08T15:41:16.519968+01:00 chambotte pcloud.desktop[22544]: fusermount: /home/fulconis/pCloudDrive not mounted
2024-02-08T15:41:16.533092+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Color.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.534112+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Keyboard.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.534961+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.MediaKeys.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.535660+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Power.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.536440+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.XSettings.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.537514+01:00 chambotte systemd[17056]: xdg-desktop-portal-gnome.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.537668+01:00 chambotte systemd[17056]: xdg-desktop-portal-gnome.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.538006+01:00 chambotte systemd[17056]: xdg-desktop-portal-gtk.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.538151+01:00 chambotte systemd[17056]: xdg-desktop-portal-gtk.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.538661+01:00 chambotte systemd[17056]: gnome-terminal-server.service: Main process exited, code=exited, status=1/FAILURE
2024-02-08T15:41:16.538702+01:00 chambotte systemd[17056]: gnome-terminal-server.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.538858+01:00 chambotte systemd[17056]: gnome-terminal-server.service: Consumed 26.126s CPU time.
2024-02-08T15:41:16.539059+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Wacom.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.540400+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Color.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.540801+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Keyboard.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.541127+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.MediaKeys.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.541470+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.Power.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.541859+01:00 chambotte systemd[17056]: org.gnome.SettingsDaemon.XSettings.service: Failed with result 'exit-code'.
2024-02-08T15:41:16.542326+01:00 chambotte systemd[17056]: app-gnome-Nextcloud-18076.scope: Consumed 1.537s CPU time.
2024-02-08T15:41:16.543017+01:00 chambotte systemd[17056]: app-gnome-org.gnome.Software-18099.scope: Consumed 1.810s CPU time.
2024-02-08T15:41:16.545099+01:00 chambotte systemd[17056]: Stopped target gnome-session-x11@ubuntu.target - GNOME X11 Session (session: ubuntu).
2024-02-08T15:41:16.545237+01:00 chambotte systemd[17056]: Stopped target graphical-session.target - Current graphical user session.
2024-02-08T15:41:16.549551+01:00 chambotte systemd[17056]: Stopped update-notifier-release.path - Path trigger for new release of Ubuntu notifications.
2024-02-08T15:41:16.550066+01:00 chambotte systemd[17056]: Stopped target gnome-session.target - GNOME Session.
2024-02-08T15:41:16.550150+01:00 chambotte systemd[17056]: Stopped target gnome-session-x11.target - GNOME X11 Session.
2024-02-08T15:41:16.550207+01:00 chambotte systemd[17056]: Stopped target gnome-session@ubuntu.target - GNOME Session (session: ubuntu).
....
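To count how often this happens and which ring and process are involved, a small Python sketch like the one below could scan the syslog. It only relies on the message formats visible in the lines above; the function name find_ring_timeouts and the dictionary layout are assumptions made for this example.

```python
import re

# Patterns taken from the "ring ... timeout" and "Process information" lines above.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"Process information: process (?P<name>\S+) pid (?P<pid>\d+)"
)


def find_ring_timeouts(lines):
    """Collect ring timeout events, with the process reported right after each."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append({
                # The first field of each syslog line is the ISO timestamp.
                "timestamp": line.split(" ", 1)[0],
                "ring": match["ring"],
                "signaled": int(match["signaled"]),
                "emitted": int(match["emitted"]),
                "process": None,
            })
            continue
        match = PROCESS_RE.search(line)
        # Attach the process to the most recent timeout that has none yet.
        if match and events and events[-1]["process"] is None:
            events[-1]["process"] = (match["name"], int(match["pid"]))
    return events
```

It can be fed an open file, for example find_ring_timeouts(open("/var/log/syslog", errors="replace")), assuming the log lives at that path and is readable.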
Edited by patrick fulconis