"kernel NULL pointer dereference, address: 0000000000000010" / "0010:amdgpu_vm_bo_update+0x637/0x760 [amdgpu]" when starting applications
System information
inxi -GSC -xx:
System:
Host: excalibur Kernel: 6.11.8-1-default arch: x86_64 bits: 64 compiler: gcc
v: 14.2.1
Desktop: KDE Plasma v: 6.2.4 tk: Qt v: N/A wm: kwin_wayland dm: SDDM
Distro: openSUSE Tumbleweed 20241203
CPU:
Info: 8-core model: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics bits: 64
type: MT MCP arch: Zen 4 rev: 1 cache: L1: 512 KiB L2: 8 MiB L3: 16 MiB
Speed (MHz): avg: 1098 min/max: 400/5137 boost: enabled cores: 1: 1098
2: 1098 3: 1098 4: 1098 5: 1098 6: 1098 7: 1098 8: 1098 9: 1098 10: 1098
11: 1098 12: 1098 13: 1098 14: 1098 15: 1098 16: 1098 bogomips: 121424
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
Graphics:
Device-1: Advanced Micro Devices [AMD/ATI] Phoenix1 driver: amdgpu v: kernel
arch: RDNA-3 pcie: speed: 16 GT/s lanes: 16 ports: active: eDP-1 empty: DP-1,
DP-2, DP-3, DP-4, DP-5, DP-6, HDMI-A-1, Writeback-1 bus-ID: 03:00.0
chip-ID: 1002:15bf temp: 39.0 C
Device-2: DX-231115-J HD WebCam driver: uvcvideo type: USB rev: 2.0
speed: 480 Mb/s lanes: 1 bus-ID: 1-4:4 chip-ID: 927b:bc13
Display: wayland server: X.org v: 1.21.1.14 with: Xwayland v: 24.1.4
compositor: kwin_wayland driver: X: loaded: modesetting unloaded: vesa
alternate: fbdev dri: radeonsi gpu: amdgpu display-ID: 0
Monitor-1: eDP-1 res: 1707x1067 size: N/A
API: EGL v: 1.5 platforms: device: 0 drv: radeonsi device: 1 drv: swrast
gbm: drv: kms_swrast surfaceless: drv: radeonsi wayland: drv: radeonsi x11:
drv: radeonsi
API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.3.0 glx-v: 1.4
direct-render: yes renderer: AMD Radeon 780M (radeonsi gfx1103_r1 LLVM
19.1.4 DRM 3.59 6.11.8-1-default) device-ID: 1002:15bf display-ID: :1.0
API: Vulkan v: 1.3.296 surfaces: xcb,xlib,wayland device: 0
type: integrated-gpu driver: N/A device-ID: 1002:15bf
- OS: Linux (openSUSE Tumbleweed 20241203)
- GPU: AMD Radeon 780M (03:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Phoenix1 [1002:15bf] (rev c2))
- Kernel version: 6.11.8-1-default
- Mesa version: git
- Wayland version: 1.23.1
- XWayland version: 24.1.4
- Desktop manager and compositor: KDE Plasma/KWin 6.2.4
Describe the issue
When I start certain applications, for example "glxinfo -B", Spectacle, or the calibre E-book viewer, I get BUG messages like this in dmesg:
[čet dec 5 02:29:14 2024] [ T80204] BUG: kernel NULL pointer dereference, address: 0000000000000010
[čet dec 5 02:29:14 2024] [ T80204] #PF: supervisor read access in kernel mode
[čet dec 5 02:29:14 2024] [ T80204] #PF: error_code(0x0000) - not-present page
[čet dec 5 02:29:14 2024] [ T80204] PGD 0 P4D 0
[čet dec 5 02:29:14 2024] [ T80204] Oops: Oops: 0000 [#16] PREEMPT SMP NOPTI
[čet dec 5 02:29:14 2024] [ T80204] CPU: 11 UID: 1000 PID: 80204 Comm: glxinfo:cs0 Tainted: G D 6.11.8-1-default #1 openSUSE Tumbleweed 1400000003000000474e55001983fa4dd342f6a5
[čet dec 5 02:29:14 2024] [ T80204] Tainted: [D]=DIE
[čet dec 5 02:29:14 2024] [ T80204] Hardware name: SLIMBOOK EXCALIBUR-16-AMD7 /EXCALIBUR-16-AMD7 , BIOS SLIMH6.03 05/29/2024
[čet dec 5 02:29:14 2024] [ T80204] RIP: 0010:amdgpu_vm_bo_update+0x637/0x760 [amdgpu]
[čet dec 5 02:29:14 2024] [ T80204] Code: 00 00 00 49 83 ce 08 48 c7 44 24 10 00 00 00 00 41 89 c7 48 8b 92 b8 f7 ff ff 41 83 e7 01 e9 cf fb ff ff 49 8b 91 c8 01 00 00 <8b> 52 10 83 fa 07 77 0c 41 8b 31 23 34 95 60 89 35 c1 75 4e 48 89
[čet dec 5 02:29:14 2024] [ T80204] RSP: 0018:ffffa56cc8eab950 EFLAGS: 00010246
[čet dec 5 02:29:14 2024] [ T80204] RAX: ffff93b92cc568c0 RBX: ffff93b7cd680000 RCX: ffff93b92cc56880
[čet dec 5 02:29:14 2024] [ T80204] RDX: 0000000000000000 RSI: ffff93b7d07a8148 RDI: ffff93b92cc568c0
[čet dec 5 02:29:14 2024] [ T80204] RBP: ffff93b92cc568b8 R08: 0000000000000000 R09: ffff93ba59dc6400
[čet dec 5 02:29:14 2024] [ T80204] R10: 0000000080200018 R11: ffff93b92cc568c0 R12: ffff93b92cc568d0
[čet dec 5 02:29:14 2024] [ T80204] R13: ffff93b7faf89000 R14: 0003000000000070 R15: 0000000000000000
[čet dec 5 02:29:14 2024] [ T80204] FS: 00007f4e0d3ab6c0(0000) GS:ffff93bebe780000(0000) knlGS:0000000000000000
[čet dec 5 02:29:14 2024] [ T80204] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[čet dec 5 02:29:14 2024] [ T80204] CR2: 0000000000000010 CR3: 0000000104566000 CR4: 0000000000f50ef0
[čet dec 5 02:29:14 2024] [ T80204] PKRU: 55555554
[čet dec 5 02:29:14 2024] [ T80204] Call Trace:
[čet dec 5 02:29:14 2024] [ T80204] <TASK>
[čet dec 5 02:29:14 2024] [ T80204] ? __die_body.cold+0x19/0x26
[čet dec 5 02:29:14 2024] [ T80204] ? page_fault_oops+0x132/0x2a0
[čet dec 5 02:29:14 2024] [ T80204] ? exc_page_fault+0x73/0x170
[čet dec 5 02:29:14 2024] [ T80204] ? asm_exc_page_fault+0x26/0x30
[čet dec 5 02:29:14 2024] [ T80204] ? amdgpu_vm_bo_update+0x637/0x760 [amdgpu 1400000003000000474e55002a5d07a34a8d2172]
[čet dec 5 02:29:14 2024] [ T80204] amdgpu_cs_ioctl+0x112f/0x1990 [amdgpu 1400000003000000474e55002a5d07a34a8d2172]
[čet dec 5 02:29:14 2024] [ T80204] ? unmap_region.constprop.0+0x11f/0x150
[čet dec 5 02:29:14 2024] [ T80204] ? rcutree_enqueue+0x20/0x120
[čet dec 5 02:29:14 2024] [ T80204] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu 1400000003000000474e55002a5d07a34a8d2172]
[čet dec 5 02:29:14 2024] [ T80204] drm_ioctl_kernel+0xa8/0x100
[čet dec 5 02:29:14 2024] [ T80204] drm_ioctl+0x266/0x4f0
[čet dec 5 02:29:14 2024] [ T80204] ? __pfx_amdgpu_cs_ioctl+0x10/0x10 [amdgpu 1400000003000000474e55002a5d07a34a8d2172]
[čet dec 5 02:29:14 2024] [ T80204] ? futex_wake+0x8f/0x1b0
[čet dec 5 02:29:14 2024] [ T80204] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 1400000003000000474e55002a5d07a34a8d2172]
[čet dec 5 02:29:14 2024] [ T80204] __x64_sys_ioctl+0x94/0xd0
[čet dec 5 02:29:14 2024] [ T80204] do_syscall_64+0x82/0x160
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? do_syscall_64+0x8e/0x160
[čet dec 5 02:29:14 2024] [ T80204] ? do_mprotect_pkey+0x148/0x580
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? __rseq_handle_notify_resume+0xa6/0x4c0
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? switch_fpu_return+0x4e/0xd0
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? syscall_exit_to_user_mode+0x160/0x220
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? do_syscall_64+0x8e/0x160
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? do_user_addr_fault+0x36c/0x620
[čet dec 5 02:29:14 2024] [ T80204] ? srso_alias_return_thunk+0x5/0xfbef5
[čet dec 5 02:29:14 2024] [ T80204] ? exc_page_fault+0x73/0x170
[čet dec 5 02:29:14 2024] [ T80204] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[čet dec 5 02:29:14 2024] [ T80204] RIP: 0033:0x7f4e1731890f
[čet dec 5 02:29:14 2024] [ T80204] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[čet dec 5 02:29:14 2024] [ T80204] RSP: 002b:00007f4e0d3aa920 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[čet dec 5 02:29:14 2024] [ T80204] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f4e1731890f
[čet dec 5 02:29:14 2024] [ T80204] RDX: 00007f4e0d3aa9b0 RSI: 00000000c0186444 RDI: 0000000000000006
[čet dec 5 02:29:14 2024] [ T80204] RBP: 00007f4e0d3aa9f0 R08: 00007f4e0d3aaa70 R09: 00007f4e0d3aa980
[čet dec 5 02:29:14 2024] [ T80204] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000c0186444
[čet dec 5 02:29:14 2024] [ T80204] R13: 00007f4e0d3aa9b0 R14: 00007f4e0d3aabc8 R15: 00007f4e0d3aaa30
[čet dec 5 02:29:14 2024] [ T80204] </TASK>
[čet dec 5 02:29:14 2024] [ T80204] Modules linked in: ccm rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device af_packet qrtr nf_tables iptable_filter nvme_fabrics nvme_keyring cmac algif_hash algif_skcipher af_alg bnep nls_iso8859_1 nls_cp437 vfat fat iwlmvm mac80211 libarc4 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel amd_atl intel_rapl_msr snd_intel_dspcfg uvcvideo intel_rapl_common snd_intel_sdw_acpi videobuf2_vmalloc btusb uvc snd_hda_codec btrtl videobuf2_memops edac_mce_amd btintel videobuf2_v4l2 snd_hda_core btbcm iwlwifi videodev snd_hwdep btmtk videobuf2_common snd_pcm kvm_amd spd5118 snd_timer mc bluetooth tiny_power_button cfg80211 snd kvm i2c_piix4 soundcore rfkill pcspkr i2c_smbus k10temp thermal joydev button wireless_hotkey amd_pmc ac loop fuse efi_pstore dm_mod configfs nfnetlink dmi_sysfs ip_tables x_tables crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 xhci_pci xhci_pci_renesas nvme
[čet dec 5 02:29:14 2024] [ T80204] hid_multitouch xhci_hcd nvme_core hid_generic aesni_intel gf128mul crypto_simd cryptd usbcore ccp sp5100_tco nvme_auth battery i2c_hid_acpi i2c_hid serio_raw amdgpu video wmi amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec gpu_sched drm_suballoc_helper drm_buddy drm_display_helper cec rc_core btrfs blake2b_generic libcrc32c crc32c_intel xor raid6_pq sbs sbshc msr i2c_dev efivarfs
[čet dec 5 02:29:14 2024] [ T80204] CR2: 0000000000000010
[čet dec 5 02:29:14 2024] [ T80204] ---[ end trace 0000000000000000 ]---
[čet dec 5 02:29:14 2024] [ T80204] RIP: 0010:amdgpu_vm_bo_update+0x637/0x760 [amdgpu]
[čet dec 5 02:29:14 2024] [ T80204] Code: 00 00 00 49 83 ce 08 48 c7 44 24 10 00 00 00 00 41 89 c7 48 8b 92 b8 f7 ff ff 41 83 e7 01 e9 cf fb ff ff 49 8b 91 c8 01 00 00 <8b> 52 10 83 fa 07 77 0c 41 8b 31 23 34 95 60 89 35 c1 75 4e 48 89
[čet dec 5 02:29:14 2024] [ T80204] RSP: 0018:ffffa56cc4d0f740 EFLAGS: 00010246
[čet dec 5 02:29:14 2024] [ T80204] RAX: ffff93bc55f9b840 RBX: ffff93b7cd680000 RCX: ffff93bc55f9b800
[čet dec 5 02:29:14 2024] [ T80204] RDX: 0000000000000000 RSI: ffff93b8b93f8148 RDI: ffff93bc55f9b840
[čet dec 5 02:29:14 2024] [ T80204] RBP: ffff93bc55f9b838 R08: 0000000000000000 R09: ffff93b89f032000
[čet dec 5 02:29:14 2024] [ T80204] R10: 0000000000001000 R11: ffff93bc55f9b840 R12: ffff93bc55f9b850
[čet dec 5 02:29:14 2024] [ T80204] R13: ffff93bc783ac000 R14: 0003000000000070 R15: 0000000000000000
[čet dec 5 02:29:14 2024] [ T80204] FS: 00007f4e0d3ab6c0(0000) GS:ffff93bebe780000(0000) knlGS:0000000000000000
[čet dec 5 02:29:14 2024] [ T80204] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[čet dec 5 02:29:14 2024] [ T80204] CR2: 0000000000000010 CR3: 0000000104566000 CR4: 0000000000f50ef0
[čet dec 5 02:29:14 2024] [ T80204] PKRU: 55555554
[čet dec 5 02:29:14 2024] [ T80204] note: glxinfo:cs0[80204] exited with irqs disabled
The full log is attached as dmesg.txt.
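For anyone triaging this, the faulting address can be cross-checked against the register dump straight from the trace. The following is a minimal Python sketch, not part of any kernel tooling. The helper names (`register`, `faulting_bytes`, `decode_modrm`, `fault_address`) are made up for illustration, and the decoder only handles the one instruction form seen here (opcode 8b with a register base and an 8-bit displacement). It takes the bytes at the `<8b>` marker, which the kernel uses to flag the instruction at RIP, and adds the displacement to the base register value from the dump.

```python
import re

# Excerpt of the oops above: timestamps and the [T80204] tag are stripped, and
# the Code: line is shortened to the bytes around the <..> marker.
OOPS = """\
RIP: 0010:amdgpu_vm_bo_update+0x637/0x760 [amdgpu]
Code: 49 8b 91 c8 01 00 00 <8b> 52 10 83 fa 07 77 0c
RDX: 0000000000000000 RSI: ffff93b7d07a8148 RDI: ffff93b92cc568c0
CR2: 0000000000000010 CR3: 0000000104566000 CR4: 0000000000f50ef0
"""

# x86-64 register numbering for ModRM fields when no REX prefix is present.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def register(oops: str, name: str) -> int:
    """Return the value of a register such as RDX or CR2 from the dump."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})", oops)
    if match is None:
        raise ValueError(f"register {name} not found in oops text")
    return int(match.group(1), 16)


def faulting_bytes(oops: str) -> list[int]:
    """Return the code bytes starting at the <xx> marker (the byte at RIP)."""
    code_line = next(l for l in oops.splitlines() if l.startswith("Code:"))
    tokens = code_line.split()[1:]
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:]]


def decode_modrm(byte: int) -> tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


def fault_address(oops: str) -> int:
    """Compute the memory address read by the faulting instruction."""
    opcode, modrm, disp8 = faulting_bytes(oops)[:3]
    mod, _reg, rm = decode_modrm(modrm)
    # Only "8b /r" with mod=01 (base register + disp8, no SIB byte) is handled.
    if opcode != 0x8B or mod != 1 or rm == 4:
        raise ValueError("instruction form not handled by this sketch")
    if disp8 >= 0x80:  # disp8 is sign-extended
        disp8 -= 0x100
    return register(oops, REG64[rm].upper()) + disp8


if __name__ == "__main__":
    print(f"computed fault address: {fault_address(OOPS):#x}")
    print(f"CR2 from the dump:      {register(OOPS, 'CR2'):#x}")
```

Under these assumptions, the marked bytes `8b 52 10` are a 32-bit load from `[rdx+0x10]`, and with RDX = 0 the computed address equals the CR2 value in the trace. The bytes just before the marker (`49 8b 91 c8 01 00 00`) appear to decode as a load of RDX from `[r9+0x1c8]`, which would mean a NULL pointer was read from the object R9 points to. Which field of which amdgpu structure that is cannot be told from the trace alone.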
Regression
I bisected Mesa, and the first bad commit is 8c91624614c1f939974fe0d2d1a3baf83335cecb, "winsys/amdgpu: use VM_ALWAYS_VALID for all VRAM and GTT allocations", by @yogeshmohan.
Edited by Jure Repinc