Closed
Issue created Mar 06, 2023 by Johannes Deger (@jaydizzle)

[Renoir] [Cezanne] Random Freezes and Black-screen

Brief summary of the problem:

Hi, since at least kernel 6.2.1 I have been experiencing random freezes with amdgpu as soon as I do "heavier" work, e.g. rendering the Google Maps 3D view. Usually the system freezes and then a black screen starts flashing. I need to reboot my system to get it working again. The DRM logs cycle between these entries:

Mär 06 12:53:59 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=287925, emitted seq=287927
Mär 06 12:53:59 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process chrome pid 6163 thread chrome:cs0 pid 6278
Mär 06 12:53:59 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Mär 06 12:54:00 aurora kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
Mär 06 12:54:00 aurora kernel: amdgpu 0000:07:00.0: amdgpu: MODE2 reset
Mär 06 12:54:00 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
Mär 06 12:54:00 aurora kernel: [drm] PCIE GART of 1024M enabled.
Mär 06 12:54:00 aurora kernel: [drm] PTB located at 0x000000F41FC00000
Mär 06 12:54:00 aurora kernel: [drm] PSP is resuming...
Mär 06 12:54:00 aurora kernel: [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: RAS: optional ras ta ucode is not available
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: RAP: optional rap ta ucode is not available
Mär 06 12:54:01 aurora kernel: [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
Mär 06 12:54:01 aurora kernel: [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: Secure display: Generic Failure.
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: SMU is resuming...
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: SMU is resumed successfully!
Mär 06 12:54:01 aurora kernel: [drm] DMUB hardware initialized: version=0x01010026
Mär 06 12:54:01 aurora kernel: [drm] kiq ring mec 2 pipe 1 q 0
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Mär 06 12:54:01 aurora kernel: [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
Mär 06 12:54:01 aurora kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset(2) failed
Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset end with ret = -110
Mär 06 12:54:01 aurora kernel: [drm] Skip scheduling IBs!
Mär 06 12:54:01 aurora kernel: [drm] Skip scheduling IBs!
Mär 06 12:54:01 aurora kernel: [drm] Skip scheduling IBs!
Mär 06 12:54:01 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Mär 06 12:54:12 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=44790, emitted seq=44792
Mär 06 12:54:12 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
Mär 06 12:54:12 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Mär 06 12:54:12 aurora kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
Mär 06 12:54:12 aurora kernel: amdgpu 0000:07:00.0: amdgpu: MODE2 reset
Mär 06 12:54:12 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
Mär 06 12:54:12 aurora kernel: [drm] PCIE GART of 1024M enabled.
Mär 06 12:54:12 aurora kernel: [drm] PTB located at 0x000000F41FC00000
Mär 06 12:54:12 aurora kernel: [drm] PSP is resuming...
Mär 06 12:54:13 aurora kernel: [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: RAS: optional ras ta ucode is not available
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: RAP: optional rap ta ucode is not available
Mär 06 12:54:13 aurora kernel: [drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
Mär 06 12:54:13 aurora kernel: [drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: Secure display: Generic Failure.
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: SECUREDISPLAY: query securedisplay TA failed. ret 0x0
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: SMU is resuming...
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: SMU is resumed successfully!
Mär 06 12:54:13 aurora kernel: [drm] DMUB hardware initialized: version=0x01010026
Mär 06 12:54:13 aurora kernel: [drm] kiq ring mec 2 pipe 1 q 0
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test failed (-110)
Mär 06 12:54:13 aurora kernel: [drm:amdgpu_gfx_enable_kcq.cold [amdgpu]] *ERROR* KCQ enable failed
Mär 06 12:54:13 aurora kernel: [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <gfx_v9_0> failed -110
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset(3) failed
Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset end with ret = -110
Mär 06 12:54:13 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* GPU Recovery Failed: -110
Mär 06 12:54:24 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=44792, emitted seq=44794
Mär 06 12:54:24 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process  pid 0 thread  pid 0
Mär 06 12:54:24 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
Mär 06 12:54:24 aurora kernel: [drm] psp gfx command UNLOAD_TA(0x2) failed and response status is (0x117)
Mär 06 12:54:24 aurora kernel: amdgpu 0000:07:00.0: amdgpu: MODE2 reset
Mär 06 12:54:24 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
Mär 06 12:54:24 aurora kernel: [drm] PCIE GART of 1024M enabled.
Mär 06 12:54:24 aurora kernel: [drm] PTB located at 0x000000F41FC00000
Mär 06 12:54:24 aurora kernel: [drm] PSP is resuming...
Mär 06 12:54:25 aurora kernel: [drm] reserve 0x400000 from 0xf41f800000 for PSP TMR
Mär 06 12:54:25 aurora kernel: amdgpu 0000:07:00.0: amdgpu: RAS: optional ras ta ucode is not available
Mär 06 12:54:25 aurora kernel: amdgpu 0000:07:00.0: amdgpu: RAP: optional rap ta ucode is not available

Right now I am on kernel 6.1.11, which runs fine.
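
For anyone comparing their own journal against the log above, here is a minimal sketch that counts ring timeouts per ring and collects the failed reset counters. The regular expressions are only derived from the message formats quoted in this report, and the names `summarize`, `TIMEOUT_RE`, `RESET_FAILED_RE` and `SAMPLE` are made up for illustration; other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Assumption: patterns are modelled on the lines quoted in this report only.
TIMEOUT_RE = re.compile(
    r"ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
RESET_FAILED_RE = re.compile(r"GPU reset\((?P<count>\d+)\) failed")


def summarize(lines):
    """Return (timeouts per ring, list of failed reset counters)."""
    timeouts = Counter()
    failed_resets = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timeouts[match.group("ring")] += 1
            continue
        match = RESET_FAILED_RE.search(line)
        if match:
            failed_resets.append(int(match.group("count")))
    return timeouts, failed_resets


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Mär 06 12:53:59 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring gfx_low timeout, signaled seq=287925, emitted seq=287927",
    "Mär 06 12:54:01 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset(2) failed",
    "Mär 06 12:54:12 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring sdma0 timeout, signaled seq=44790, emitted seq=44792",
    "Mär 06 12:54:13 aurora kernel: amdgpu 0000:07:00.0: amdgpu: GPU reset(3) failed",
    "Mär 06 12:54:24 aurora kernel: [drm:amdgpu_job_timedout [amdgpu]] "
    "*ERROR* ring sdma0 timeout, signaled seq=44792, emitted seq=44794",
]

timeouts, failed_resets = summarize(SAMPLE)
for ring, count in sorted(timeouts.items()):
    print(f"{ring}: {count} timeout(s)")
print("failed reset counters:", failed_resets)
```

On the sample lines this reports one `gfx_low` timeout followed by two `sdma0` timeouts, which matches the pattern in the log: the first hang comes from the chrome graphics job, and every later timeout is on sdma0 because the failed reset never brings the GPU back.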

Hardware description:

  • CPU: AMD Ryzen 7 PRO 4750U with Radeon Graphics
  • GPU: 07:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Renoir [1002:1636] (rev d1)
  • System Memory: 32G
  • Display(s): 1x 5k LG HDR, 1x1440p iiyama
  • Type of Display Connection: 1x USB-C, 1x USB-C to HDMI.

System information:

  • Distro name and Version: Arch
  • Kernel version: 6.2.2-zen1-1-zen
  • Custom kernel: 6.2.2-zen1-1-zen
  • AMD official driver version: N/A

How to reproduce the issue:

It seems to happen randomly. However, it happens more often during GPU-intensive tasks such as video decoding or rendering the Google Maps 3D view.

Attached files:

Log files (for system lockups / game freezes / crashes)

JournalCTL: https://pastebin.com/GypLMP02
