The screen froze with a null pointer dereference when amdgpu started while booting a 6.12 merge window kernel
Brief summary of the problem:
I booted the Fedora Rawhide KDE live image Fedora-KDE-Live-x86_64-Rawhide-20240922.n.0.iso on an HP laptop with an AMD A10-9620P CPU and Radeon R5 integrated GPU. The screen froze when amdgpu started with the 6.12 merge window kernel 6.12.0-0.rc0.20240920gitbaeb9a7d8b60.7.fc42.x86_64. When I booted with rhgb and quiet removed from the kernel command line, I could see that the last messages shown on screen came from amdgpu startup. The Plasma startup sound played, so the boot must have continued even though the screen didn't change. To get a kernel log, I installed 6.12.0-0.rc0.20240920gitbaeb9a7d8b60.7.fc42 in my Fedora 41 KDE installation. A null pointer dereference happened when amdgpu started, with the following trace in the journal:
Sep 23 01:53:02 kernel: [drm] amdgpu kernel modesetting enabled.
Sep 23 01:53:02 kernel: amdgpu: Virtual CRAT table created for CPU
Sep 23 01:53:02 kernel: amdgpu: Topology: Add CPU node
Sep 23 01:53:02 kernel: [drm] initializing kernel modesetting (CARRIZO 0x1002:0x9874 0x103C:0x8332 0xCA).
Sep 23 01:53:02 kernel: [drm] register mmio base: 0xF0400000
Sep 23 01:53:02 kernel: [drm] register mmio size: 262144
Sep 23 01:53:02 kernel: [drm] add ip block number 0 <vi_common>
Sep 23 01:53:02 kernel: [drm] add ip block number 1 <gmc_v8_0>
Sep 23 01:53:02 kernel: [drm] add ip block number 2 <cz_ih>
Sep 23 01:53:02 kernel: [drm] add ip block number 3 <gfx_v8_0>
Sep 23 01:53:02 kernel: [drm] add ip block number 4 <sdma_v3_0>
Sep 23 01:53:02 kernel: [drm] add ip block number 5 <powerplay>
Sep 23 01:53:02 kernel: [drm] add ip block number 6 <dm>
Sep 23 01:53:02 kernel: [drm] add ip block number 7 <uvd_v6_0>
Sep 23 01:53:02 kernel: [drm] add ip block number 8 <vce_v3_0>
Sep 23 01:53:02 kernel: [drm] add ip block number 9 <acp_ip>
Sep 23 01:53:02 kernel: amdgpu 0000:00:01.0: amdgpu: Fetched VBIOS from VFCT
Sep 23 01:53:02 kernel: amdgpu: ATOM BIOS: 113-C75100-031
Sep 23 01:53:02 kernel: [drm] UVD is enabled in physical mode
Sep 23 01:53:02 kernel: [drm] VCE enabled in physical mode
Sep 23 01:53:02 kernel: Console: switching to colour dummy device 80x25
Sep 23 01:53:02 kernel: amdgpu 0000:00:01.0: vgaarb: deactivate vga console
Sep 23 01:53:02 kernel: amdgpu 0000:00:01.0: amdgpu: Trusted Memory Zone (TMZ) feature not supported
Sep 23 01:53:02 kernel: [drm] vm size is 64 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
Sep 23 01:53:02 kernel: amdgpu 0000:00:01.0: amdgpu: VRAM: 512M 0x000000F400000000 - 0x000000F41FFFFFFF (512M used)
Sep 23 01:53:02 kernel: amdgpu 0000:00:01.0: amdgpu: GART: 1024M 0x000000FF00000000 - 0x000000FF3FFFFFFF
Sep 23 01:53:02 kernel: [drm] Detected VRAM RAM=512M, BAR=512M
Sep 23 01:53:02 kernel: [drm] RAM width 64bits UNKNOWN
Sep 23 01:53:02 kernel: BUG: kernel NULL pointer dereference, address: 00000000000000a0
Sep 23 01:53:02 kernel: #PF: supervisor read access in kernel mode
Sep 23 01:53:02 kernel: #PF: error_code(0x0000) - not-present page
Sep 23 01:53:02 kernel: PGD 0 P4D 0
Sep 23 01:53:02 kernel: Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
Sep 23 01:53:02 kernel: CPU: 1 UID: 0 PID: 407 Comm: (udev-worker) Not tainted 6.12.0-0.rc0.20240920gitbaeb9a7d8b60.7.fc42.x86_64 #1
Sep 23 01:53:02 kernel: Hardware name: HP HP Laptop 15-bw0xx/8332, BIOS F.52 12/03/2019
Sep 23 01:53:02 kernel: RIP: 0010:dma_get_required_mask+0x15/0x50
Sep 23 01:53:02 kernel: Code: 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 87 38 02 00 00 48 85 c0 74 12 <48> 8b 80 a0 00 00 00 48 85 c0 74 20 e9 ba cf 00 01 cc 48 8b 05 3a
Sep 23 01:53:02 kernel: RSP: 0018:ffffacad8052f8c0 EFLAGS: 00010202
Sep 23 01:53:02 kernel: RAX: 0000000000000000 RBX: 000000ffffffffff RCX: 0000000000000027
Sep 23 01:53:02 kernel: RDX: 0000000000000000 RSI: ffffffffc17a4e03 RDI: ffff9d66816010c8
Sep 23 01:53:02 kernel: RBP: ffff9d66816010c8 R08: 0000000000000000 R09: 0000000000000000
Sep 23 01:53:02 kernel: R10: 0720072007200720 R11: 0720072007200720 R12: 0000000000000000
Sep 23 01:53:02 kernel: R13: ffff9d66816010c8 R14: 0000000000000000 R15: ffff9d6695480000
Sep 23 01:53:02 kernel: FS: 00007f609ec0a980(0000) GS:ffff9d6777480000(0000) knlGS:0000000000000000
Sep 23 01:53:02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 23 01:53:02 kernel: CR2: 00000000000000a0 CR3: 00000001022aa000 CR4: 00000000001506f0
Sep 23 01:53:02 kernel: Call Trace:
Sep 23 01:53:02 kernel: <TASK>
Sep 23 01:53:02 kernel: ? __die_body.cold+0x19/0x27
Sep 23 01:53:02 kernel: ? page_fault_oops+0x15a/0x2f0
Sep 23 01:53:02 kernel: ? search_module_extables+0x19/0x60
Sep 23 01:53:02 kernel: ? search_bpf_extables+0x5f/0x80
Sep 23 01:53:02 kernel: ? exc_page_fault+0x7e/0x180
Sep 23 01:53:02 kernel: ? asm_exc_page_fault+0x26/0x30
Sep 23 01:53:02 kernel: ? dma_get_required_mask+0x15/0x50
Sep 23 01:53:02 kernel: dma_addressing_limited+0x62/0x90
Sep 23 01:53:02 kernel: amdgpu_ttm_init+0x3d/0x5a0 [amdgpu]
Sep 23 01:53:02 kernel: gmc_v8_0_sw_init+0x2c9/0x6c0 [amdgpu]
Sep 23 01:53:02 kernel: amdgpu_device_init.cold+0x1a28/0x2019 [amdgpu]
Sep 23 01:53:02 kernel: ? pci_bus_read_config_word+0x4d/0x90
Sep 23 01:53:02 kernel: amdgpu_driver_load_kms+0x19/0x70 [amdgpu]
Sep 23 01:53:02 kernel: amdgpu_pci_probe+0x1b6/0x4c0 [amdgpu]
Sep 23 01:53:02 kernel: local_pci_probe+0x45/0x90
Sep 23 01:53:02 kernel: pci_device_probe+0xc1/0x2a0
Sep 23 01:53:02 kernel: really_probe+0xde/0x340
Sep 23 01:53:02 kernel: ? pm_runtime_barrier+0x54/0x90
Sep 23 01:53:02 kernel: ? __pfx___driver_attach+0x10/0x10
Sep 23 01:53:02 kernel: __driver_probe_device+0x78/0x110
Sep 23 01:53:02 kernel: driver_probe_device+0x1f/0xa0
Sep 23 01:53:02 kernel: __driver_attach+0xba/0x1c0
Sep 23 01:53:02 kernel: bus_for_each_dev+0x8f/0xe0
Sep 23 01:53:02 kernel: bus_add_driver+0x142/0x220
Sep 23 01:53:02 kernel: driver_register+0x72/0xd0
Sep 23 01:53:02 kernel: ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
Sep 23 01:53:02 kernel: do_one_initcall+0x5b/0x310
Sep 23 01:53:02 kernel: do_init_module+0x90/0x260
Sep 23 01:53:02 kernel: __do_sys_init_module+0x17a/0x1b0
Sep 23 01:53:02 kernel: do_syscall_64+0x82/0x160
Sep 23 01:53:02 kernel: ? set_ptes.isra.0+0x41/0x90
Sep 23 01:53:02 kernel: ? do_anonymous_page+0xfc/0x8e0
Sep 23 01:53:02 kernel: ? __pte_offset_map+0x1b/0x180
Sep 23 01:53:02 kernel: ? __handle_mm_fault+0xbff/0x1040
Sep 23 01:53:02 kernel: ? __count_memcg_events+0x75/0x130
Sep 23 01:53:02 kernel: ? count_memcg_events.constprop.0+0x1a/0x30
Sep 23 01:53:02 kernel: ? handle_mm_fault+0x21b/0x330
Sep 23 01:53:02 kernel: ? do_user_addr_fault+0x55a/0x7b0
Sep 23 01:53:02 kernel: ? exc_page_fault+0x7e/0x180
Sep 23 01:53:02 kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
Sep 23 01:53:02 kernel: RIP: 0033:0x7f609ea9492e
Sep 23 01:53:02 kernel: Code: 48 8b 0d e5 24 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b2 24 0f 00 f7 d8 64 89 01 48
Sep 23 01:53:02 kernel: RSP: 002b:00007ffcf41c9e28 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Sep 23 01:53:02 kernel: RAX: ffffffffffffffda RBX: 000055f55a4e80f0 RCX: 00007f609ea9492e
Sep 23 01:53:02 kernel: RDX: 00007f609d52f3bd RSI: 00000000028bcebe RDI: 00007f6098c00010
Sep 23 01:53:02 kernel: RBP: 00007ffcf41c9ee0 R08: 000055f55a493010 R09: 0000000000000007
Sep 23 01:53:02 kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 00007f609d52f3bd
Sep 23 01:53:02 kernel: R13: 0000000000020000 R14: 000055f55a4e2b20 R15: 000055f55a4f13c0
Sep 23 01:53:02 kernel: </TASK>
Sep 23 01:53:02 kernel: Modules linked in: amdgpu(+) hid_logitech_hidpp crct10dif_pclmul crc32_pclmul amdxcp i2c_algo_bit drm_ttm_helper ttm drm_exec crc32c_intel polyval_clmulni gpu_sched polyval_generic ghash_clmulni_intel drm_suballoc_helper drm_buddy sha512_ssse3 sha256_ssse3 sha1_ssse3 drm_display_helper wdat_wdt sp5100_tco cec video wmi hid_multitouch serio_raw hid_logitech_dj scsi_dh_rdac scsi_dh_emc scsi_dh_alua ip6_tables ip_tables fuse i2c_dev
Sep 23 01:53:02 kernel: CR2: 00000000000000a0
Sep 23 01:53:02 kernel: ---[ end trace 0000000000000000 ]---
Sep 23 01:53:02 kernel: RIP: 0010:dma_get_required_mask+0x15/0x50
Sep 23 01:53:02 kernel: Code: 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 87 38 02 00 00 48 85 c0 74 12 <48> 8b 80 a0 00 00 00 48 85 c0 74 20 e9 ba cf 00 01 cc 48 8b 05 3a
Sep 23 01:53:02 kernel: RSP: 0018:ffffacad8052f8c0 EFLAGS: 00010202
Sep 23 01:53:02 kernel: RAX: 0000000000000000 RBX: 000000ffffffffff RCX: 0000000000000027
Sep 23 01:53:02 kernel: RDX: 0000000000000000 RSI: ffffffffc17a4e03 RDI: ffff9d66816010c8
Sep 23 01:53:02 kernel: RBP: ffff9d66816010c8 R08: 0000000000000000 R09: 0000000000000000
Sep 23 01:53:02 kernel: R10: 0720072007200720 R11: 0720072007200720 R12: 0000000000000000
Sep 23 01:53:02 kernel: R13: ffff9d66816010c8 R14: 0000000000000000 R15: ffff9d6695480000
Sep 23 01:53:02 kernel: FS: 00007f609ec0a980(0000) GS:ffff9d6777480000(0000) knlGS:0000000000000000
Sep 23 01:53:02 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 23 01:53:02 kernel: CR2: 00000000000000a0 CR3: 00000001022aa000 CR4: 00000000001506f0
This problem happened on 3 of 3 boots with 6.12.0-0.rc0.20240920gitbaeb9a7d8b60.7.fc42. Kernel 6.11.0 didn't have this problem. I'll try to bisect.
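For anyone comparing this with their own journal, the key facts in the trace above (the faulting address, the kernel-mode RIP symbol, and the call-trace frames not marked with `?`) can be pulled out mechanically. This is a minimal Python sketch, not an existing tool: the names `summarize_oops` and `SAMPLE` are made up for illustration, and the timestamp prefix format is assumed to match the journal lines quoted in this report.

```python
import re

# Journal prefix such as "Sep 23 01:53:02 kernel: " (format assumed from this report).
PREFIX = re.compile(r"^\w{3} +\d+ [\d:]{8} kernel: ")

def summarize_oops(journal_text):
    """Return the fault address, kernel RIP symbol and reliable call-trace frames."""
    summary = {"address": None, "rip": None, "frames": []}
    in_trace = False
    for raw in journal_text.splitlines():
        line = PREFIX.sub("", raw.strip())
        if line.startswith("BUG: kernel NULL pointer dereference"):
            # The address is the last whitespace-separated field, in hex.
            summary["address"] = int(line.rsplit(" ", 1)[1], 16)
        elif line.startswith("RIP: 0010:") and summary["rip"] is None:
            # CS 0010 is kernel mode; keep only the first occurrence.
            summary["rip"] = line.split(":", 2)[2]
        elif line == "<TASK>":
            in_trace = True
        elif line == "</TASK>":
            in_trace = False
        elif in_trace and re.match(r"^[\w.]+\+0x", line):
            # Frames prefixed with "?" fail the match and are skipped as unreliable.
            summary["frames"].append(line.split("+", 1)[0])
    return summary

# A short excerpt of the trace from this report.
SAMPLE = """\
Sep 23 01:53:02 kernel: BUG: kernel NULL pointer dereference, address: 00000000000000a0
Sep 23 01:53:02 kernel: RIP: 0010:dma_get_required_mask+0x15/0x50
Sep 23 01:53:02 kernel: Call Trace:
Sep 23 01:53:02 kernel: <TASK>
Sep 23 01:53:02 kernel: ? dma_get_required_mask+0x15/0x50
Sep 23 01:53:02 kernel: dma_addressing_limited+0x62/0x90
Sep 23 01:53:02 kernel: amdgpu_ttm_init+0x3d/0x5a0 [amdgpu]
Sep 23 01:53:02 kernel: </TASK>
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE))
```

Run against the full log, the frame list would start at `dma_addressing_limited` and continue down through `amdgpu_ttm_init` and `gmc_v8_0_sw_init`, which is the call path into `dma_get_required_mask` shown in the trace.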
Hardware description:
- CPU: AMD A10-9620P
- GPU: 00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Wani [Radeon R5/R6/R7 Graphics] [1002:9874] (rev ca)
- System Memory: 8 GB
- Display(s): Integrated Elan touchscreen
- Type of Display Connection: eDP
System information:
- Distro name and Version: Fedora Rawhide
- Kernel version: 6.12.0-0.rc0.20240920gitbaeb9a7d8b60.7.fc42
- Custom kernel: N/A
- AMD official driver version: N/A
How to reproduce the issue:
- Boot a Fedora 41 KDE installation
- Log in to Plasma 6.1.5 on Wayland
- Download the Fedora Rawhide KDE live image Fedora-KDE-Live-x86_64-Rawhide-20240922.n.0.iso from https://koji.fedoraproject.org/koji/buildinfo?buildID=2550880
- Start Fedora Media Writer
- Use Fedora Media Writer to write Fedora-KDE-Live-x86_64-Rawhide-20240922.n.0.iso to a USB flash drive
- Reboot into Fedora-KDE-Live-x86_64-Rawhide-20240922.n.0.iso from the USB flash drive on a system with an affected AMD GPU
Attached files:
Log files (for system lockups / game freezes / crashes)
- Dmesg log (full log): screen-froze-amdgpu-starting-6.12.0-0.rc0.20240920gitbaeb9a7d8b60.7.fc42-1.txt
- Xorg log
- Any other log