oops in dpu_plane_atomic_print_state() at boot on sc7180 Trogdor.Lazor w/ v6.13-rc1
The following oops is seen at boot on Trogdor.Lazor with the v6.13-rc1 kernel.
[ 1.911710] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
[ 1.911737] Mem abort info:
[ 1.911742] ESR = 0x0000000096000004
[ 1.911748] EC = 0x25: DABT (current EL), IL = 32 bits
[ 1.911755] SET = 0, FnV = 0
[ 1.911760] EA = 0, S1PTW = 0
[ 1.911765] FSC = 0x04: level 0 translation fault
[ 1.911771] Data abort info:
[ 1.911776] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 1.911782] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 1.911788] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 1.911795] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000108fe5000
[ 1.911801] [0000000000000020] pgd=0000000000000000, p4d=0000000000000000
[ 1.911815] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 1.911825] Modules linked in: joydev
[ 1.911839] CPU: 2 UID: 0 PID: 422 Comm: grep Tainted: G W 6.13.0-rc1-g40384c840ea1 #1 331783aba8099af8c108726948e6d79d6909356d
[ 1.911852] Tainted: [W]=WARN
[ 1.911857] Hardware name: Google Lazor (rev3 - 8) with KB Backlight (DT)
[ 1.911864] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1.911873] pc : dpu_plane_atomic_print_state+0x48/0x1e8
[ 1.911889] lr : dpu_plane_atomic_print_state+0x38/0x1e8
[ 1.911899] sp : ffff800084f63b40
[ 1.911904] pmr: 000000e0
[ 1.911909] x29: ffff800084f63b40 x28: 0000000000000000 x27: 0000000000000000
[ 1.911923] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 1.911936] x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000
[ 1.911948] x20: ffff0cc3c28e5a00 x19: ffff800084f63c50 x18: 0000000000000000
[ 1.911961] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 1.911973] x14: 0000000000000000 x13: 0000000000000064 x12: ffff0000ffffff00
[ 1.911986] x11: 0000000000000000 x10: 00000000000000e7 x9 : 15152b1a38b0fd00
[ 1.911998] x8 : 0000000000000000 x7 : 6e6168635f746d67 x6 : 000000000000000a
[ 1.912010] x5 : ffff0cc3c0b710e7 x4 : ffffa172e896a43d x3 : ffff0a01ffffff10
[ 1.912023] x2 : 0000000000000001 x1 : ffffa172e894e50c x0 : ffff800084f63c50
[ 1.912036] Call trace:
[ 1.912043] dpu_plane_atomic_print_state+0x48/0x1e8 (P)
[ 1.912055] dpu_plane_atomic_print_state+0x38/0x1e8 (L)
[ 1.912065] drm_atomic_plane_print_state+0x1d4/0x200
[ 1.912075] __drm_state_dump+0x84/0x1f8
[ 1.912085] drm_state_info+0x64/0x98
[ 1.912095] seq_read+0x12c/0x4a8
[ 1.912105] full_proxy_read+0x74/0xb0
[ 1.912116] __arm64_sys_read+0x150/0x330
[ 1.912126] invoke_syscall+0x4c/0xf0
[ 1.912137] do_el0_svc+0x98/0xf8
[ 1.912147] el0_svc+0x38/0x68
[ 1.912158] el0t_64_sync_handler+0x20/0x128
[ 1.912168] el0t_64_sync+0x1b0/0x1b8
[ 1.912180] Code: f9405e88 90006b21 91143021 aa1303e0 (f9401102)
[ 1.912186] ---[ end trace 0000000000000000 ]---
[ 1.919490] Kernel panic - not syncing: Oops: Fatal exception
[ 1.919512] SMP: stopping secondary CPUs
[ 1.920085] Kernel Offset: 0x217267600000 from 0xffff800080000000
[ 1.920095] PHYS_OFFSET: 0xfff0f33d40000000
[ 1.920102] CPU features: 0x080,00006150,00901253,8200720b
[ 1.920112] Memory Limit: none
The PC corresponds to this line in dpu_plane_atomic_print_state():
drm_printf(p, "\tsspp[0]=%s\n", pipe->sspp->cap->name);
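I bisected the kernel and the oops first appears at commit 31f7148f ("drm/msm/dpu: move pstate->pipe initialization to dpu_plane_atomic_check"). My guess is that the state is being dumped through debugfs (the trace shows grep reaching drm_state_info() via full_proxy_read()) before any atomic check has run, so pipe->sspp is still NULL. The register dump is consistent with that: x8 is 0 and the faulting instruction (f9401102, i.e. ldr x2, [x8, #0x20]) reads from offset 0x20 of a NULL pointer, which matches the reported fault address.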
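One possible way to avoid the dereference is to check the pointer before printing. The following is only a userspace sketch of that idea, not the driver code and not necessarily the fix that should go upstream. The mock_* struct and function names and the "(none)" placeholder are hypothetical stand-ins; only the "sspp[0]=" format string is taken from the line above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct mock_sspp_cap { const char *name; };
struct mock_sspp { const struct mock_sspp_cap *cap; };
struct mock_pipe { const struct mock_sspp *sspp; };

/* Return a printable SSPP name, or a placeholder if none is assigned yet. */
static const char *mock_pipe_sspp_name(const struct mock_pipe *pipe)
{
	if (!pipe || !pipe->sspp || !pipe->sspp->cap)
		return "(none)";
	return pipe->sspp->cap->name;
}

/* Format the "sspp[0]=" line without dereferencing a NULL sspp. */
static int mock_print_sspp(char *buf, size_t len, const struct mock_pipe *pipe)
{
	return snprintf(buf, len, "\tsspp[0]=%s\n", mock_pipe_sspp_name(pipe));
}
```

With a guard of this kind, reading the state before the first atomic check would print a placeholder instead of faulting. Whether the real fix should guard the print or initialize pipe->sspp earlier is a separate question for the driver maintainers.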