drm / amd / Issues / #1736
Issue created Oct 08, 2021 by Dan Horák (@sharkcz)

[bisected] driver crash in 5.15-rc1 on Radeon WX4100 when in power saving mode

Brief summary of the problem:

When resuming from a power saving mode (activated earlier by a screensaver), the driver crashes and the display shows garbage (or stays "off"). The crash first appeared in the 5.15-rc1 build, and I then reproduced it with rc2, rc3 and rc4. Nothing like that was observed in 5.13.x or 5.14.0.

A workaround is to add "amdgpu.runpm=0" to the kernel command line.
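
To confirm that the workaround is actually in effect on a running system, something like the following Python sketch could be used. It assumes the usual Linux locations /proc/cmdline and /sys/module/amdgpu/parameters/runpm; the function names are hypothetical, and the parser simply splits on whitespace, so quoted arguments containing spaces are not handled.

```python
from pathlib import Path


def runpm_from_cmdline(cmdline: str):
    """Return the amdgpu.runpm value found on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        if token.startswith("amdgpu.runpm="):
            # Keep scanning so that a later occurrence overrides an earlier one.
            value = token.split("=", 1)[1]
    return value


def report_runpm():
    """Print what the booted kernel and the loaded module report (Linux only)."""
    cmdline = Path("/proc/cmdline").read_text()
    print("kernel command line:", runpm_from_cmdline(cmdline))
    # Assumption: the amdgpu module is loaded and exposes the parameter here.
    param = Path("/sys/module/amdgpu/parameters/runpm")
    if param.exists():
        print("module parameter:", param.read_text().strip())
    else:
        print("module parameter: not available")
```

Calling report_runpm() after a reboot should show "0" in both places if the parameter was picked up.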

...

Sep 17 09:22:50 talos.danny.cz kernel: amdgpu 0000:01:00.0: refused to change power state from D0 to D3hot
Sep 17 09:22:51 talos.danny.cz kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
Sep 17 09:22:51 talos.danny.cz kernel: [drm] UVD and UVD ENC initialized successfully.
Sep 17 09:22:52 talos.danny.cz kernel: [drm] VCE initialized successfully.
Sep 17 09:23:08 talos.danny.cz kernel: amdgpu 0000:01:00.0: refused to change power state from D0 to D3hot
Sep 17 09:23:11 talos.danny.cz kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400000000).
Sep 17 09:23:11 talos.danny.cz kernel: [drm] UVD and UVD ENC initialized successfully.
Sep 17 09:23:11 talos.danny.cz kernel: [drm] VCE initialized successfully.
Sep 17 09:23:11 talos.danny.cz kernel: amdgpu 0000:01:00.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-22).
Sep 17 09:23:11 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
Sep 17 09:23:11 talos.danny.cz kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
Sep 17 09:23:11 talos.danny.cz kernel: [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-22).
Sep 17 09:23:11 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
Sep 17 09:23:11 talos.danny.cz kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
Sep 17 09:23:11 talos.danny.cz kernel: amdgpu 0000:01:00.0: amdgpu: couldn't schedule ib on ring <sdma0>
Sep 17 09:23:11 talos.danny.cz kernel: [drm:amdgpu_job_run [amdgpu]] *ERROR* Error scheduling IBs (-22)
...

The "refused to change power state" messages seem to be harmless(?). Unfortunately, I haven't yet found what exactly triggers the crash.
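
One way to check whether the "refused to change power state" messages correlate with the failures is to group the kernel log into resume cycles and count the errors in each. The Python sketch below does that under two assumptions taken only from the excerpt above: every resume logs one "PCIE GART ... enabled" line, and failures are marked with "*ERROR*". The function name and the shape of the returned records are hypothetical.

```python
import re

# Matches the journal prefix used in the excerpt:
# "Sep 17 09:22:50 <host> kernel: <message>"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+kernel:\s+(.*)$")


def split_resume_cycles(lines):
    """Group kernel messages into resume cycles.

    A cycle starts at each "PCIE GART ... enabled" line (assumption).
    For every cycle, record whether a "refused to change power state"
    message preceded it and how many "*ERROR*" lines followed it.
    """
    cycles = []
    refused_pending = False
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip "..." markers and non-kernel lines
        stamp, msg = match.groups()
        if "refused to change power state" in msg:
            refused_pending = True
        elif "PCIE GART" in msg and "enabled" in msg:
            cycles.append({"start": stamp, "refused": refused_pending, "errors": 0})
            refused_pending = False
        elif "*ERROR*" in msg and cycles:
            cycles[-1]["errors"] += 1
    return cycles
```

Run over the full attached log (for example the lines of amdgpu-5.15-rc1.log), this would show whether resumes that were preceded by the refusal message also complete without errors. In the short excerpt above, both resumes are preceded by it, but only the second one reports errors.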

I have started bisecting at the amd-drm-next-5.15-2021-09-01 tag, on the assumption that the crash is caused by a change in the drm-next-5.15 branch between the v5.14-rc3 tag and "today".

Hardware description:

  • CPU: IBM Power9
  • GPU: Radeon Pro WX 4100
  • System Memory: 64GB
  • Display(s): Dell U2412M
  • Type of Display Connection: DP

System information:

  • Distro name and Version: Fedora 34
  • Kernel version: kernel-5.15.0-0.rc1.12.fc36.ppc64le
  • Custom kernel: N/A
  • AMD package version: N/A

How to reproduce the issue:

  • activate screensaver and let the monitor enter power saving mode
  • wait for an unknown amount of time
  • leave the power saving mode by moving a mouse/pressing a key

Attached files:

amdgpu-5.15-rc1.log Xorg.0.log.old

Edited Oct 27, 2021 by Dan Horák