[Regression] 5.16.16 amdgpu causes FPU exception on ppc64le systems
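The 5.16 kernel series appears to have introduced a regression in amdgpu on ppc64le. Loading the module triggers an FPU-type exception (reported as "Unrecoverable VMX/Altivec Unavailable Exception") in dcn20_resource_construct during display initialization, as follows: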
[ 3.072146] Unrecoverable VMX/Altivec Unavailable Exception f20 at c008000001f3357c
[ 3.072170] Oops: Unrecoverable VMX/Altivec Unavailable Exception, sig: 6 [#1]
[ 3.072196] LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ 3.072212] Modules linked in: amdgpu(+) gpu_sched i2c_algo_bit drm_ttm_helper ttm drm_kms_helper syscopyarea sd_mod sysfillrect sysimgblt xhci_pci fb_sys_fops xhci_hcd nvme tg3 usbcore drm nvme_core crc32c_vpmsum libphy t10_pi ahci crc_t10dif crct10dif_generic ptp crct10dif_vpmsum usb_common libahci pps_core crct10dif_common drm_panel_orientation_quirks
[ 3.072310] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.16.16 #1
[ 3.072336] Workqueue: events work_for_cpu_fn
[ 3.072361] NIP: c008000001f3357c LR: c008000001f344cc CTR: 0000000000000000
[ 3.072395] REGS: c000000001aa32b0 TRAP: 0f20 Not tainted (5.16.16)
[ 3.072428] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84002220 XER: 00000004
[ 3.072472] CFAR: c008000001f344c8 IRQMASK: 0
[ 3.072472] GPR00: c008000001f344cc c000000001aa3550 c00800000224d000 0000000000000000
[ 3.072472] GPR04: c000000004e80000 c000000004c8c800 c000000004c8d000 c0ccc80404c8b0c0
[ 3.072472] GPR08: c0000003ff3a4328 0000000000000000 0000000000000000 c000000004c8cc00
[ 3.072472] GPR12: c000000000477530 c000000001777000 c00000000016b338 c000000003e66120
[ 3.072472] GPR16: c000000003e66128 c000000003e60000 c000000003e76ac0 c000000003e66138
[ 3.072472] GPR20: c000000003e66140 c000000003e66130 0000000000000100 0000000000000001
[ 3.072472] GPR24: 0000000000000001 c00800000225ba38 0000000000000001 c000000003e75bc1
[ 3.072472] GPR28: c000000004e90000 c000000001aa37b0 c000000004e80000 c000000004c8c800
[ 3.072726] NIP [c008000001f3357c] dcn20_resource_construct+0x44/0xf30 [amdgpu]
[ 3.073010] LR [c008000001f344cc] dcn20_create_resource_pool+0x64/0x100 [amdgpu]
[ 3.073291] Call Trace:
[ 3.073299] [c000000001aa3550] [c008000001f344b0] dcn20_create_resource_pool+0x48/0x100 [amdgpu] (unreliable)
[ 3.073585] [c000000001aa35d0] [c00800000205d7f0] dc_create_resource_pool+0x2f8/0x3a0 [amdgpu]
[ 3.073860] [c000000001aa3600] [c00800000204e694] dc_create+0x1cc/0x680 [amdgpu]
[ 3.074129] [c000000001aa36b0] [c008000001eb93c4] amdgpu_dm_init.isra.0+0x1ec/0x1dd0 [amdgpu]
[ 3.074404] [c000000001aa3910] [c008000001ebafd0] dm_hw_init+0x28/0x60 [amdgpu]
[ 3.074672] [c000000001aa3940] [c008000001c0ac9c] amdgpu_device_init+0x1d34/0x21b0 [amdgpu]
[ 3.074899] [c000000001aa3a90] [c008000001c0c3b0] amdgpu_driver_load_kms+0x48/0x370 [amdgpu]
[ 3.075118] [c000000001aa3b10] [c008000001c01e6c] amdgpu_pci_probe+0x264/0x400 [amdgpu]
[ 3.075336] [c000000001aa3bc0] [c000000000802688] local_pci_probe+0x68/0x110
[ 3.075365] [c000000001aa3c40] [c000000000158e98] work_for_cpu_fn+0x38/0x60
[ 3.075391] [c000000001aa3c70] [c00000000015ec28] process_one_work+0x2a8/0x590
[ 3.075428] [c000000001aa3d10] [c00000000015f810] worker_thread+0x2a0/0x610
[ 3.075463] [c000000001aa3da0] [c00000000016b4e8] kthread+0x1b8/0x1c0
[ 3.075488] [c000000001aa3e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64
[ 3.075507] Instruction dump:
[ 3.075517] fb61ffd8 fbc1fff0 fbe1fff8 fa81ffa0 faa1ffa8 fac1ffb0 fae1ffb8 fb01ffc0
[ 3.075549] fb21ffc8 fb41ffd0 fb81ffe0 fba1ffe8 <100004c4> 3920ff90 3940ff80 7c9f2378
[ 3.075583] ---[ end trace d9884fa0e00b9a00 ]---
Opening this bug here for tracking. It may be related to #1215 (closed).
Originally reported here: https://forums.raptorcs.com/index.php/topic,332.msg2611.html
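
Triage note: the faulting instruction is the word in angle brackets in the instruction dump (<100004c4>). The following is a minimal Python sketch for decoding it by hand. It assumes the Power ISA VX-form field layout (primary opcode 4, 11-bit extended opcode) and that the MSR vector-available bit (VEC) is bit 25 (0x02000000). The mnemonic table is deliberately tiny and the helper names (decode_vx, describe) are made up for illustration; a disassembler such as objdump on the built module remains the authoritative check.

```python
# Decode the faulting word from the oops "Instruction dump" (the value in <...>).
# Assumption: the word is a Power ISA VX-form vector instruction (primary opcode 4).

FAULTING_WORD = 0x100004C4   # copied from the oops: <100004c4>
MSR = 0x9000000000009033     # copied from the oops "MSR:" line
MSR_VEC = 1 << 25            # assumed position of the VMX/Altivec-available bit

# Deliberately incomplete table of VX-form extended opcodes (illustrative only).
VX_MNEMONICS = {1028: "vand", 1156: "vor", 1220: "vxor"}


def decode_vx(word):
    """Split a 32-bit instruction word into VX-form fields."""
    primary = (word >> 26) & 0x3F   # primary opcode (top 6 bits)
    vrt = (word >> 21) & 0x1F       # target vector register
    vra = (word >> 16) & 0x1F       # first source vector register
    vrb = (word >> 11) & 0x1F       # second source vector register
    xo = word & 0x7FF               # extended opcode (low 11 bits)
    return primary, vrt, vra, vrb, xo


def describe(word):
    """Return a rough mnemonic for a VX-form word, or a raw .long otherwise."""
    primary, vrt, vra, vrb, xo = decode_vx(word)
    if primary != 4:
        return f".long 0x{word:08x}"
    name = VX_MNEMONICS.get(xo, f"vx_xo_{xo}")
    return f"{name} v{vrt},v{vra},v{vrb}"


if __name__ == "__main__":
    print(describe(FAULTING_WORD))
    print("MSR[VEC] set:", bool(MSR & MSR_VEC))
```

Under these assumptions the word decodes to a vector XOR on v0, and the VEC bit is clear in the reported MSR, which matches the flag list in the oops (no VEC among SF,HV,EE,ME,IR,DR,RI,LE). That would be consistent with compiler-generated Altivec code in dcn20_resource_construct running without the kernel first enabling vector use, though the log alone does not confirm the cause.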