5.14.8: "Oops: Kernel access of bad area, sig:11" followed by lockup in dmesg during bootup (amdgpu/ppc64le)
Brief summary of the problem:
Hello! I think I found a problem with amdgpu on 5.14.8. I haven't tried any of the earlier releases yet. The dmesg log displays "Oops: Kernel access of bad area, sig: 11 [#1]" during bootup. After that it prints a bit more information and then the system locks up, requiring a hard power-off.
I've attached the dmesg log; the Oops message appears on line 868 of that file.
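For anyone who wants to jump straight to the Oops in a saved log, here is a minimal sketch in Python. The helper names `find_oops_lines` and `scan_file` are made up for illustration and are not part of any existing tool; the pattern is just the message text quoted above, with optional whitespace after `sig:`.

```python
import re
from typing import Iterable, List, Tuple

# Matches the message quoted in this report, with or without a space after "sig:".
OOPS_PATTERN = re.compile(r"Oops: Kernel access of bad area, sig:\s*\d+")


def find_oops_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (1-based line number, line text) for every line containing the Oops message."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOPS_PATTERN.search(line)
    ]


def scan_file(path: str) -> List[Tuple[int, str]]:
    """Scan a saved dmesg log (hypothetical helper); undecodable bytes are replaced."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return find_oops_lines(log)
```

Calling `scan_file("5.14.8-dmesg.txt")` on the attached log should list the matching line numbers, assuming the file is in the current directory.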
Hardware description:
- CPU: Sforza POWER9 8-Core (x2)
- Motherboard: Raptor Computing Systems Talos II (T2P9D01 Rev 1.01)
- GPU: POWERCOLOR AMD Vega Rx 56 "Red Dragon"
- System Memory: 64GB
System information:
- Distro name and Version: Gentoo (rolling release)
- Kernel version: 5.14.8 (little endian kernel)
- Page Size: 4k
- Custom kernel: No patches or customization beyond including the necessary drivers.
- Most recent working kernels I tested: 5.13.19, 5.10.69
- AMD package version: x11-drivers/xf86-video-amdgpu-21.0.0
How to reproduce the issue:
- Start the computer and select kernel 5.14.8
- Watch the boot over the serial console or the OpenBMC console client to capture the kernel log.
Workarounds:
- Hard power-off the computer and boot into kernel 5.13.19 or 5.10.69 instead.
Attached files:
- Dmesg log 5.14.8-dmesg.txt
Edited by Peter Easton