[regression] amdgpu NULL pointer dereference on ppc64le in 5.19-rc7
Brief summary of the problem:
After updating the kernel from 5.19-rc6 to 5.19-rc7, the kernel oopses during amdgpu initialization at boot with:
[ 3.078328] Kernel attempted to read user page (498) - exploit attempt? (uid: 0)
[ 3.078355] BUG: Kernel NULL pointer dereference on read at 0x00000498
[ 3.078379] Faulting instruction address: 0xc0080000038f06dc
[ 3.078393] Oops: Kernel access of bad area, sig: 11 [#1]
[ 3.078421] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ 3.078463] Modules linked in: amdgpu(+) ast nvme mfd_core drm_vram_helper gpu_sched drm_ttm_helper tg3 vmx_crypto drm_display_helper ttm nvme_core crc32c_vpmsum cec scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables pkcs8_key_parser dm_multipath fuse
[ 3.078581] CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 5.19.0-0.rc7.53.fc37.ppc64le #1
[ 3.078612] Workqueue: events work_for_cpu_fn
[ 3.078629] NIP: c0080000038f06dc LR: c0080000038f2b08 CTR: 0000000000000000
[ 3.078677] REGS: c0000000038d3320 TRAP: 0300 Not tainted (5.19.0-0.rc7.53.fc37.ppc64le)
[ 3.078716] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 24002220 XER: 00000000
[ 3.078763] CFAR: c0080000038f06a0 DAR: 0000000000000498 DSISR: 40000000 IRQMASK: 0
[ 3.078763] GPR00: c0080000038f2b08 c0000000038d35c0 c008000003ab8000 c00000000bd00000
[ 3.078763] GPR04: c0000000038d37f0 000000000000000f c000000005daa220 000000000000001a
[ 3.078763] GPR08: c000000005daa220 0000000000000000 0000000000000000 c68d4ef7050f600d
[ 3.078763] GPR12: c0000000004dc4e0 c000000002b40000 c0000000057d0000 c0000000057d7e50
[ 3.078763] GPR16: c0000000057c5da8 c0000000057c5db0 c0000000057c0000 0000000000000000
[ 3.078763] GPR20: c0000000057c5da0 0000000000000100 0000000000000001 0000000000000001
[ 3.078763] GPR24: c0000000057d6e01 c008000003ac45c0 c0000000057e0000 0000000000000000
[ 3.078763] GPR28: c00000000bd10000 0000000000000000 c00000000bd003a8 c00000000bd00000
[ 3.079108] NIP [c0080000038f06dc] dc_destruct+0xe4/0x2d0 [amdgpu]
[ 3.079461] LR [c0080000038f2b08] dc_create+0x390/0x5b0 [amdgpu]
[ 3.079814] Call Trace:
[ 3.079822] [c0000000038d35c0] [c00000000bd10000] 0xc00000000bd10000 (unreliable)
[ 3.079852] [c0000000038d3620] [c0080000038f2b08] dc_create+0x390/0x5b0 [amdgpu]
[ 3.080192] [c0000000038d36c0] [c008000003872ee0] amdgpu_dm_init.isra.0+0x238/0x1f00 [amdgpu]
[ 3.080547] [c0000000038d3940] [c008000003874bd0] dm_hw_init+0x28/0x60 [amdgpu]
[ 3.080880] [c0000000038d3970] [c008000003539a08] amdgpu_device_init+0x1c50/0x23a0 [amdgpu]
[ 3.081203] [c0000000038d3ad0] [c00800000353b958] amdgpu_driver_load_kms+0x30/0x240 [amdgpu]
[ 3.081536] [c0000000038d3b50] [c008000003530890] amdgpu_pci_probe+0x1c8/0x540 [amdgpu]
[ 3.081858] [c0000000038d3be0] [c000000000adc748] local_pci_probe+0x68/0xe0
[ 3.081898] [c0000000038d3c60] [c000000000174888] work_for_cpu_fn+0x38/0x60
[ 3.081940] [c0000000038d3c90] [c00000000017a04c] process_one_work+0x2ac/0x570
[ 3.081990] [c0000000038d3d30] [c00000000017abf0] worker_thread+0x280/0x630
[ 3.082042] [c0000000038d3dc0] [c000000000186da4] kthread+0x124/0x130
[ 3.082089] [c0000000038d3e10] [c00000000000ce54] ret_from_kernel_thread+0x5c/0x64
[ 3.082153] Instruction dump:
[ 3.082199] 60420000 e93e0009 3bbd0001 2c290000 7fc3f378 41820010 4800cf2d 60000000
[ 3.082250] 895f03a8 7c1d5040 4180ffdc e93f0418 <e9490498> 810a002c 2c080000 41820094
[ 3.082300] ---[ end trace 0000000000000000 ]---
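For triage, the faulting instruction (marked `<e9490498>` in the instruction dump) and the one before it can be decoded by hand. Below is a minimal sketch, assuming the standard Power ISA DS-form encoding for primary opcode 58 (`ld`/`ldu`/`lwa`). The helper names `decode_ds_form` and `format_insn` are hypothetical and written only for this report; they are not part of any kernel tooling.

```python
# Decode PowerPC DS-form load instructions (primary opcode 58) taken from
# the "Instruction dump" lines of the oops above.
# decode_ds_form and format_insn are hypothetical helpers for illustration.

DS_FORM_MNEMONICS = {0: "ld", 1: "ldu", 2: "lwa"}


def decode_ds_form(word):
    """Return (mnemonic, rt, ra, displacement) for an opcode-58 instruction."""
    opcode = (word >> 26) & 0x3F
    if opcode != 58:
        raise ValueError(f"not a DS-form load: primary opcode {opcode}")
    rt = (word >> 21) & 0x1F   # target register
    ra = (word >> 16) & 0x1F   # base register
    xo = word & 0x3            # extended opcode selects ld/ldu/lwa
    if xo not in DS_FORM_MNEMONICS:
        raise ValueError(f"unknown extended opcode {xo}")
    # The 14-bit DS field is shifted left by 2, which is the same as taking
    # the low 16 bits with the two XO bits cleared, then sign-extending.
    disp = word & 0xFFFC
    if disp & 0x8000:
        disp -= 0x10000
    return DS_FORM_MNEMONICS[xo], rt, ra, disp


def format_insn(word):
    """Render the decoded instruction in assembler-like syntax."""
    mnemonic, rt, ra, disp = decode_ds_form(word)
    return f"{mnemonic} r{rt}, {hex(disp)}(r{ra})"


print(format_insn(0xE93F0418))  # instruction just before the fault
print(format_insn(0xE9490498))  # faulting instruction
```

This prints `ld r9, 0x418(r31)` followed by `ld r10, 0x498(r9)`. GPR09 is 0 in the register dump, so the effective address of the second load is 0 + 0x498, which matches the DAR value 0x498. That suggests the pointer loaded from offset 0x418 of the structure in r31 was NULL when dc_destruct ran. Which field that offset corresponds to cannot be determined from the log alone.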
Hardware description:
- CPU: POWER9
- GPU: 0000:03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] [1002:731f] (rev c1)
- System Memory:
- Display(s):
- Type of Display Connection: HDMI
System information:
- Distro name and Version: Fedora Rawhide (Fedora 37)
- Kernel version: 5.19.0-0.rc7.53.fc37.ppc64le
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
(1) Install the 5.19-rc7 kernel package from Fedora, updating from 5.19-rc6.
(2) Reboot.
I located a commit that may be causing the issue: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.19-rc7&id=d11219ad53dcf61ced53ca60fe0c4a8d34393e6c