[Navi] Unrecoverable FP Unavailable Exception on 5.6.x
On POWER9 (ppc64le) with a Navi RX 5700 XT, I'm getting the following crash in the display core on 5.6.x:
[ 3.069342] Unrecoverable FP Unavailable Exception 800 at c00800000156311c
[ 3.069370] Oops: Unrecoverable FP Unavailable Exception, sig: 6 [#1]
[ 3.069395] LE PAGE_SIZE=4K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
[ 3.069409] Modules linked in: raid6_pq(+) libcrc32c cuse fuse kvm_hv kvm ext4 crc32c_generic crc16 mbcache jbd2 usbmouse hid_generic usbhid hid sd_mod amdgpu ast gpu_sched drm_vram_helper drm_ttm_helper i2c_algo_bit ttm drm_kms_helper ahci syscopyarea sysfillrect sysimgblt xhci_pci fb_sys_fops cec libahci xhci_hcd rc_core libata drm usbcore crc32c_vpmsum scsi_mod drm_panel_orientation_quirks agpgart
[ 3.069554] CPU: 23 PID: 310 Comm: kworker/23:1 Not tainted 5.6.7_1 #1
[ 3.069634] Workqueue: events dm_irq_work_func [amdgpu]
[ 3.069648] NIP: c00800000156311c LR: c008000001627c14 CTR: c0080000015630f8
[ 3.069674] REGS: c00000000547b2d0 TRAP: 0800 Not tainted (5.6.7_1)
[ 3.069696] MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 24002282 XER: 00000000
[ 3.069738] CFAR: c008000001627c10 IRQMASK: 0
GPR00: c008000001627c14 c00000000547b560 c008000001838e00 c00000078c000000
GPR04: c000000003780000 0000000000000000 0000000000000001 0000000100000000
GPR08: 0000000100000000 c00800000184b518 7ffffffe7fffffff 0000000000000000
GPR12: c0080000015630f8 c0000007ff72fe00 c00000000016bad8 c0000000059d7800
GPR16: 0000000000000000 c000000005475000 0000000000000000 c000000004336400
GPR20: c000000003780000 0000000000000000 ffffffffcccccccd 0000000000000004
GPR24: 0000000000000000 c000000791015080 c00000078c000000 c00000078f109f80
GPR28: c00000078c000000 c000000003780000 0000000000000000 0000000000000001
[ 3.069987] NIP [c00800000156311c] dcn20_validate_bandwidth+0x24/0x270 [amdgpu]
[ 3.070048] LR [c008000001627c14] dc_validate_global_state+0x3fc/0x420 [amdgpu]
[ 3.070077] Call Trace:
[ 3.070129] [c00000000547b560] [c008000001627b00] dc_validate_global_state+0x2e8/0x420 [amdgpu] (unreliable)
[ 3.070230] [c00000000547b5f0] [c0080000014fe1ec] amdgpu_dm_atomic_check+0x15b4/0x1700 [amdgpu]
[ 3.070273] [c00000000547b750] [c0080000005b7be4] drm_atomic_check_only+0x64c/0x910 [drm]
[ 3.070318] [c00000000547b880] [c0080000005b7ed8] drm_atomic_commit+0x30/0xa0 [drm]
[ 3.070362] [c00000000547b8f0] [c0080000005d45b4] drm_client_modeset_commit_atomic+0x1dc/0x300 [drm]
[ 3.070418] [c00000000547b9b0] [c0080000005d4754] drm_client_modeset_commit_force+0x7c/0x240 [drm]
[ 3.070462] [c00000000547ba00] [c008000000366b14] drm_fb_helper_restore_fbdev_mode_unlocked+0x9c/0x150 [drm_kms_helper]
[ 3.070505] [c00000000547ba50] [c008000000366c14] drm_fb_helper_set_par+0x4c/0xa0 [drm_kms_helper]
[ 3.070536] [c00000000547bac0] [c008000000366d5c] drm_fb_helper_hotplug_event.part.0+0xf4/0x140 [drm_kms_helper]
[ 3.070570] [c00000000547bb30] [c00800000034c864] drm_kms_helper_hotplug_event+0x4c/0x80 [drm_kms_helper]
[ 3.070658] [c00000000547bb60] [c008000001502bfc] handle_hpd_irq+0x134/0x190 [amdgpu]
[ 3.070755] [c00000000547bbf0] [c008000001502cec] dm_irq_work_func+0x94/0xd0 [amdgpu]
[ 3.070781] [c00000000547bc70] [c000000000161d34] process_one_work+0x264/0x520
[ 3.070819] [c00000000547bd10] [c000000000162088] worker_thread+0x98/0x5b0
[ 3.070863] [c00000000547bdb0] [c00000000016bc18] kthread+0x148/0x1a0
[ 3.070899] [c00000000547be20] [c00000000000b648] ret_from_kernel_thread+0x5c/0x74
[ 3.070944] Instruction dump:
[ 3.070964] 60000000 002d5d08 00000000 3c4c002d 38425d08 7c0802a6 60000000 7c0802a6
[ 3.071021] fb81ffd8 fba1ffe0 fbc1ffe8 fbe1fff0 <dbe1fff8> 3d220000 7c7d1b78 7c9f2378
[ 3.071073] ---[ end trace c36f353f77c5bb6d ]---
This does not happen on 5.5.x or 5.4.x (with the necessary FP patches applied, since those weren't backported to either series).
From the looks of it, I suspect that a DC_FP_START()/DC_FP_END() pair is missing somewhere.
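
As a sanity check on that suspicion, the faulting word in the instruction dump (`<dbe1fff8>`) and the MSR value from the oops can be decoded by hand. The sketch below is a standalone userspace helper, not kernel code; the names `decode_dform` and `msr_fp_enabled` are made up for illustration. It assumes the standard Power ISA D-form layout (primary opcode 54 is `stfd`) and that MSR[FP] is the 0x2000 bit, as in the kernel's `MSR_FP` definition.

```c
#include <stdint.h>

/* Fields of a Power ISA D-form instruction, e.g. stfd FRS,D(RA). */
struct dform {
    unsigned opcode; /* primary opcode (top 6 bits) */
    unsigned rs;     /* source register number */
    unsigned ra;     /* base register number */
    int disp;        /* sign-extended 16-bit displacement */
};

/* Hypothetical helper: split a 32-bit instruction word into D-form fields. */
static struct dform decode_dform(uint32_t insn)
{
    struct dform d;
    d.opcode = insn >> 26;
    d.rs = (insn >> 21) & 0x1f;
    d.ra = (insn >> 16) & 0x1f;
    d.disp = (int)(insn & 0xffff);
    if (d.disp & 0x8000)      /* sign-extend the displacement portably */
        d.disp -= 0x10000;
    return d;
}

/* Hypothetical helper: test the FP-available bit (0x2000) in an MSR value. */
static int msr_fp_enabled(uint64_t msr)
{
    return (msr & 0x2000ULL) != 0;
}
```

Feeding in `0xdbe1fff8` gives opcode 54, register 31, base register 1 and displacement -8, i.e. `stfd f31,-8(r1)`, and `msr_fp_enabled(0x9000000002009033)` is false, which matches the flag list in the oops having no FP entry. If that reading is right, the fault is a floating-point register save in the prologue of dcn20_validate_bandwidth (offset 0x24), before any code in the function body runs, so the FP section would have to be opened around the call rather than inside the function that uses FP.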