Kernel NULL pointer dereference while running deqp-vk on 6.6-rc3 with Ampere
I hit this during a multithreaded deqp-vk run on Ampere. My exact card is an NVIDIA Corporation GA104 [GeForce RTX 3060] (rev a1). The oops happens a few minutes after channel recovery. I'm guessing this isn't a regression, because the CTS has never been stable on this machine. On the userspace side I was using a custom NAK branch for this run.
Full kernel log is here. I think the relevant part is this:
Sep 30 13:31:56 m-kiwi kernel: BUG: kernel NULL pointer dereference, address: 0000000000000008
Sep 30 13:31:56 m-kiwi kernel: #PF: supervisor read access in kernel mode
Sep 30 13:31:56 m-kiwi kernel: #PF: error_code(0x0000) - not-present page
Sep 30 13:31:56 m-kiwi kernel: PGD 0 P4D 0
Sep 30 13:31:56 m-kiwi kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Sep 30 13:31:56 m-kiwi kernel: CPU: 8 PID: 85787 Comm: deqp-vk Not tainted 6.6.0-rc3-1-mainline #1 110f3f914d8dd9ab0dadb9147f4926e0a3ce2886
Sep 30 13:31:56 m-kiwi kernel: Hardware name: Micro-Star International Co., Ltd. MS-7D77/PRO B650M-A WIFI (MS-7D77), BIOS 1.40 11/08/2022
Sep 30 13:31:56 m-kiwi kernel: RIP: 0010:gp100_vmm_pgt_mem+0xbb/0x170 [nouveau]
Sep 30 13:31:56 m-kiwi kernel: Code: 8b 46 58 48 01 c2 48 09 c3 49 89 56 58 45 01 e5 41 0f b7 47 12 49 8b 7f 08 89 da 42 8d 2c e0 48 8b 47 08 41 83 c4 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7f 08 48 89 d9 48 8d 75 04 48 c1
Sep 30 13:31:56 m-kiwi kernel: RSP: 0018:ffffc900129ff698 EFLAGS: 00010206
Sep 30 13:31:56 m-kiwi kernel: RAX: 0000000000000000 RBX: 0000000000004001 RCX: 0000000000000001
Sep 30 13:31:56 m-kiwi kernel: RDX: 0000000000004001 RSI: 0000000000000210 RDI: ffff888115b0b180
Sep 30 13:31:56 m-kiwi kernel: RBP: 0000000000000210 R08: ffffc900129ff8f0 R09: 0000000000000001
Sep 30 13:31:56 m-kiwi kernel: R10: ffff888109efda20 R11: ffff8883770b6800 R12: 0000000000000003
Sep 30 13:31:56 m-kiwi kernel: R13: 0000000000000004 R14: ffffc900129ff8f0 R15: ffff888362b4d0c0
Sep 30 13:31:56 m-kiwi kernel: FS: 00007f6474cf3b80(0000) GS:ffff888808400000(0000) knlGS:0000000000000000
Sep 30 13:31:56 m-kiwi kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 30 13:31:56 m-kiwi kernel: CR2: 0000000000000008 CR3: 000000035b4ea000 CR4: 0000000000750ee0
Sep 30 13:31:56 m-kiwi kernel: PKRU: 55555554
Sep 30 13:31:56 m-kiwi kernel: Call Trace:
Sep 30 13:31:56 m-kiwi kernel: <TASK>
Sep 30 13:31:56 m-kiwi kernel: ? __die+0x23/0x70
Sep 30 13:31:56 m-kiwi kernel: ? page_fault_oops+0x171/0x4e0
Sep 30 13:31:56 m-kiwi kernel: ? nv04_timer_read+0x48/0x60 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? srso_alias_return_thunk+0x5/0x7f
Sep 30 13:31:56 m-kiwi kernel: ? exc_page_fault+0x7f/0x180
Sep 30 13:31:56 m-kiwi kernel: ? asm_exc_page_fault+0x26/0x30
Sep 30 13:31:56 m-kiwi kernel: ? gp100_vmm_pgt_mem+0xbb/0x170 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? gp100_vmm_pgt_mem+0x33/0x170 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_vmm_iter.isra.0+0x2f7/0x890 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_vmm_ptes_get_map+0x83/0xf0 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_vmm_map_locked+0x219/0x390 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? srso_alias_return_thunk+0x5/0x7f
Sep 30 13:31:56 m-kiwi kernel: ? nvkm_timer_wait_test+0x1e/0x90 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_vmm_map+0x89/0xe0 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_vram_map+0x5a/0x80 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_uvmm_mthd+0xc64/0x1070 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? srso_alias_return_thunk+0x5/0x7f
Sep 30 13:31:56 m-kiwi kernel: ? nvkm_uvmm_mthd+0x1fe/0x1070 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? __x86_indirect_jump_thunk_r10+0x20/0x20
Sep 30 13:31:56 m-kiwi kernel: ? nvkm_ioctl+0x10b/0x250 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvkm_ioctl+0x10b/0x250 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvif_object_mthd+0xb4/0x200 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvif_vmm_map+0x12b/0x140 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nouveau_mem_map+0xa3/0xf0 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nouveau_vma_new+0x1ed/0x210 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nv84_fence_context_new+0xdc/0x120 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nvc0_fence_context_new+0x12/0x40 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nouveau_channel_new+0x202/0x530 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nouveau_abi16_ioctl_channel_alloc+0x165/0x450 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: ? __pfx_nouveau_abi16_ioctl_channel_alloc+0x10/0x10 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: drm_ioctl_kernel+0xca/0x170
Sep 30 13:31:56 m-kiwi kernel: ? srso_alias_return_thunk+0x5/0x7f
Sep 30 13:31:56 m-kiwi kernel: drm_ioctl+0x26d/0x4b0
Sep 30 13:31:56 m-kiwi kernel: ? __pfx_nouveau_abi16_ioctl_channel_alloc+0x10/0x10 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau 34381a6d912b82209ed0cb7c2b51a2efa395fdb5]
Sep 30 13:31:56 m-kiwi kernel: __x64_sys_ioctl+0x94/0xd0
Sep 30 13:31:56 m-kiwi kernel: do_syscall_64+0x5d/0x90
Sep 30 13:31:56 m-kiwi kernel: ? srso_alias_return_thunk+0x5/0x7f
Sep 30 13:31:56 m-kiwi kernel: ? do_syscall_64+0x6c/0x90
Sep 30 13:31:56 m-kiwi kernel: ? do_syscall_64+0x6c/0x90
Sep 30 13:31:56 m-kiwi kernel: ? do_syscall_64+0x6c/0x90
Sep 30 13:31:56 m-kiwi kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Sep 30 13:31:56 m-kiwi kernel: RIP: 0033:0x7f647483d2ff
Sep 30 13:31:56 m-kiwi kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
Sep 30 13:31:56 m-kiwi kernel: RSP: 002b:00007ffc7dcd6e80 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Sep 30 13:31:56 m-kiwi kernel: RAX: ffffffffffffffda RBX: 000055610c6a5690 RCX: 00007f647483d2ff
Sep 30 13:31:56 m-kiwi kernel: RDX: 00007ffc7dcd6f60 RSI: 00000000c0586442 RDI: 0000000000000004
Sep 30 13:31:56 m-kiwi kernel: RBP: 00007ffc7dcd6f60 R08: 000000055610c69b R09: 00007f6474909ac0
Sep 30 13:31:56 m-kiwi kernel: R10: 000055610c6dc9c0 R11: 0000000000000246 R12: 00000000c0586442
Sep 30 13:31:56 m-kiwi kernel: R13: 0000000000000004 R14: 000055610c6a37c0 R15: 00007ffc7dcd7040
Sep 30 13:31:56 m-kiwi kernel: </TASK>
Sep 30 13:31:56 m-kiwi kernel: Modules linked in: nouveau mxm_wmi hid_logitech_hidpp intel_rapl_msr intel_rapl_common edac_mce_amd mt7921e mt7921_common mt792x_lib kvm_amd snd_hda_codec_realtek mt76_connac_lib snd_hda_codec_generic mt76 ledtrig_audio kvm snd_hda_codec_hdmi irqbypass crct10dif_pclmul snd_hda_intel crc32_pclmul snd_intel_dspcfg vfat polyval_clmulni snd_intel_sdw_acpi polyval_generic btusb gf128mul fat btrtl ghash_clmulni_intel snd_hda_codec sha512_ssse3 btintel mac80211 r8169 snd_hda_core aesni_intel btbcm snd_hwdep btmtk libarc4 crypto_simd bluetooth cryptd snd_pcm realtek hid_sony btrfs cfg80211 joydev mousedev ecdh_generic hid_logitech_dj ff_memless raid1 rapl wmi_bmof mdio_devres sp5100_tco snd_timer pcspkr ccp snd k10temp i2c_piix4 libphy blake2b_generic rfkill soundcore xor raid6_pq md_mod libcrc32c gpio_amdpt gpio_generic mac_hid crypto_user fuse dm_mod loop ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm drm_exec drm_suballoc_helper amdxcp drm_buddy gpu_sched
Sep 30 13:31:56 m-kiwi kernel: nvme drm_display_helper crc32c_intel nvme_core xhci_pci cec xhci_pci_renesas nvme_common video wmi
Sep 30 13:31:56 m-kiwi kernel: CR2: 0000000000000008
Sep 30 13:31:56 m-kiwi kernel: ---[ end trace 0000000000000000 ]---
Sep 30 13:31:56 m-kiwi kernel: RIP: 0010:gp100_vmm_pgt_mem+0xbb/0x170 [nouveau]
Sep 30 13:31:56 m-kiwi kernel: Code: 8b 46 58 48 01 c2 48 09 c3 49 89 56 58 45 01 e5 41 0f b7 47 12 49 8b 7f 08 89 da 42 8d 2c e0 48 8b 47 08 41 83 c4 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7f 08 48 89 d9 48 8d 75 04 48 c1
Sep 30 13:31:56 m-kiwi kernel: RSP: 0018:ffffc900129ff698 EFLAGS: 00010206
Sep 30 13:31:56 m-kiwi kernel: RAX: 0000000000000000 RBX: 0000000000004001 RCX: 0000000000000001
Sep 30 13:31:56 m-kiwi kernel: RDX: 0000000000004001 RSI: 0000000000000210 RDI: ffff888115b0b180
Sep 30 13:31:56 m-kiwi kernel: RBP: 0000000000000210 R08: ffffc900129ff8f0 R09: 0000000000000001
Sep 30 13:31:56 m-kiwi kernel: R10: ffff888109efda20 R11: ffff8883770b6800 R12: 0000000000000003
Sep 30 13:31:56 m-kiwi kernel: R13: 0000000000000004 R14: ffffc900129ff8f0 R15: ffff888362b4d0c0
Sep 30 13:31:56 m-kiwi kernel: FS: 00007f6474cf3b80(0000) GS:ffff888808400000(0000) knlGS:0000000000000000
Sep 30 13:31:56 m-kiwi kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 30 13:31:56 m-kiwi kernel: CR2: 0000000000000008 CR3: 000000035b4ea000 CR4: 0000000000750ee0
Sep 30 13:31:56 m-kiwi kernel: PKRU: 55555554
Sep 30 13:31:56 m-kiwi kernel: note: deqp-vk[85787] exited with irqs disabled
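As a rough sanity check on the log above, here is a small sketch that pulls the faulting instruction bytes out of the `Code:` line (the kernel marks the byte at RIP with `<..>`) and compares them against the register dump. The helper name `faulting_bytes` is made up for illustration, and the decoding in the comments is my own reading of the bytes, not something the kernel printed. The bytes `48 8b 40 08` should be `mov rax, QWORD PTR [rax+0x8]`, and with RAX = 0 that gives address 0x8, which matches CR2. The `48 8b 47 08` just before it looks like `mov rax, QWORD PTR [rdi+0x8]`, so the NULL pointer would have been loaded from offset 8 of the object in RDI.

```python
# "Code:" line copied from the oops; <..> marks the byte at the faulting RIP.
CODE = (
    "8b 46 58 48 01 c2 48 09 c3 49 89 56 58 45 01 e5 41 0f b7 47 12 "
    "49 8b 7f 08 89 da 42 8d 2c e0 48 8b 47 08 41 83 c4 01 48 89 ee "
    "<48> 8b 40 08 ff d0 0f 1f 00 49 8b 7f 08 48 89 d9 48 8d 75 04 48 c1"
)

# Values copied from the register dump in the oops.
RAX = 0x0000000000000000
CR2 = 0x0000000000000008


def faulting_bytes(code_line, count=4):
    """Return `count` bytes starting at the byte marked with <..>.

    Hypothetical helper for this report only.
    """
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:start + count]]


insn = faulting_bytes(CODE)

# 48 8b 40 08: REX.W prefix, MOV r64, r/m64, ModRM 0x40 = [rax + disp8], disp8 = 0x08.
# (Decoding done by hand; treat it as an assumption rather than disassembler output.)
disp8 = insn[3]
fault_addr = RAX + disp8

print("faulting bytes:", " ".join(f"{b:02x}" for b in insn))
print("computed fault address:", hex(fault_addr))
print("matches CR2:", fault_addr == CR2)
```

Running the full `Code:` line through the kernel's `scripts/decodecode` would give a proper disassembly; the above only checks the one marked instruction.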