GPU hang file takes so long to read that the file gets deleted before I can copy it
Hello
I'm trying to debug a GPU hang in Spider-Man (mesa/mesa#11526), but copying /sys/class/drm/card0/device/devcoredump/data to the file system takes so long that the file gets deleted before my cp or cat finishes, with this kernel message:
xe 0000:00:02.0: [drm] Xe device coredump has been deleted.
I haven't measured the exact speed, but it's in the range of kilobytes per MINUTE. Something is really, really wrong here.
This is on LNL.
Edit: I was able to copy 46 MB before the copy died. I tried both 'cat' and 'cp', always writing to /root, so the transfer was not going over a serial console or baud modems.
Cc: @zehortigoza
Activity
- Paulo Zanoni changed title from devcoredump takes so long to read that the file gets deleted before I can copy it to GPU hang file takes so long to read that the file gets deleted before I can copy it
- Paulo Zanoni changed the description
- Author Reporter
It takes about 6 seconds for each 0.1 MB, so about 1 MB per minute:
root@martianriver:~# while true; do ls -lh spiderman-gpu-hang3.data; sleep 1; done -rw------- 1 root root 1.4M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.4M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.4M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.5M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.5M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.5M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.5M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.5M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.5M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.6M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.6M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.6M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.6M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.6M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.7M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.7M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.7M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.7M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.7M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.7M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.8M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.8M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.8M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.8M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.8M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.8M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.9M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.9M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.9M Jul 25 16:40 spiderman-gpu-hang3.data 
-rw------- 1 root root 1.9M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.9M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 1.9M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.0M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.0M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.0M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.0M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.0M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.1M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.1M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.1M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.1M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.1M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.1M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.2M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.2M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.2M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.2M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.2M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.3M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.3M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.3M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.3M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.3M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.3M Jul 25 16:40 spiderman-gpu-hang3.data -rw------- 1 root root 2.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root 
root 2.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.5M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.5M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.5M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.5M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.5M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.5M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.6M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.6M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.6M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.6M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.6M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.7M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.7M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.7M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.7M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.7M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.7M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.8M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.8M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.8M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.8M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.8M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.8M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.9M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.9M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.9M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.9M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 2.9M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.0M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.0M Jul 25 
16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.0M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.0M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.0M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.0M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.1M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.1M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.1M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.1M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.1M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.1M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.2M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.2M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.2M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.2M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.2M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.2M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.3M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.3M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.3M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.3M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.3M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.3M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.4M Jul 25 16:41 spiderman-gpu-hang3.data -rw------- 1 root root 3.4M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.4M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.4M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.5M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.5M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.5M Jul 25 16:42 
spiderman-gpu-hang3.data -rw------- 1 root root 3.5M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.5M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.5M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.6M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.6M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.6M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.6M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.6M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.6M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.7M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.7M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.7M Jul 25 16:42 spiderman-gpu-hang3.data -rw------- 1 root root 3.7M Jul 25 16:42 spiderman-gpu-hang3.data
- Author Reporter
Also the 'cp' process is eating 100% of a CPU while it's trying to copy the data.
- Developer
Ok, I think I see the problem.
I was able to recreate this by triggering a hang with 4k VMAs sized 8k each.
It took about 40s to read the devcoredump.
root@DUT025-TGLU:drivers.gpu.i915.igt-gpu-tools# cat /sys/class/drm/card0/device/devcoredump/data > coredump.txt; ls -la coredump.txt
-rw-r--r-- 1 root root 8723149 Jul 25 23:21 coredump.txt
I see xe_devcoredump_read called 192 times with a count of 4k. Each time, xe_vm_snapshot_print is called, looping over all VMAs and every byte of data in each VMA in 4-byte increments.
I'm assuming only a fraction of the data is actually output on each call, and because of this looping structure the read takes forever. This is O(N*N), where N is the size of the capture. In other words, the larger the capture, the lower the output rate in bytes per second.
There has to be a better way to implement xe_devcoredump_read and make this O(N).
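To make the quadratic behaviour concrete, here is a minimal Python model of the pattern described above. It is not the xe driver code: `format_word`, `regenerate_read` and `read_all` are hypothetical names, and the model assumes that every chunked read re-formats the whole snapshot and throws away everything outside the requested window. It counts how many words get formatted in total.

```python
CHUNK = 4096  # read size mentioned in the thread ("a count of 4k")


def format_word(word: int) -> str:
    """Render one 32-bit word as a line of text (9 bytes per word)."""
    return f"{word:08x}\n"


def regenerate_read(words, offset, count, stats):
    """Hypothetical model of a read that re-prints the snapshot on every call.

    Every word is formatted each time; only the text that overlaps
    [offset, offset + count) is kept, the rest is thrown away.
    """
    kept = []
    pos = 0
    for word in words:
        text = format_word(word)
        stats["formatted"] += 1
        start, pos = pos, pos + len(text)
        if pos <= offset or start >= offset + count:
            continue  # outside the requested window: work done, output dropped
        kept.append(text[max(0, offset - start):offset + count - start])
    return "".join(kept)


def read_all(words, chunk=CHUNK):
    """Read the whole dump in fixed-size chunks, as cp or cat would."""
    stats = {"formatted": 0}
    pieces = []
    offset = 0
    while True:
        piece = regenerate_read(words, offset, chunk, stats)
        if not piece:
            break  # an empty read signals end of file
        pieces.append(piece)
        offset += len(piece)
    return "".join(pieces), stats["formatted"]


small_data, small_work = read_all(list(range(4096)))
big_data, big_work = read_all(list(range(8192)))
print("words formatted for 4096 words:", small_work)
print("words formatted for 8192 words:", big_work)
print("work ratio after doubling the capture:", big_work / small_work)
```

In this model the number of reads grows with the capture size and each read walks the whole capture, so doubling the capture roughly quadruples the formatting work.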
Edited by Matthew Brost - Developer
I think roughly to fix this:
- In the delayed worker, after we get the snapshot, print it into a human-readable buffer
- Optionally compress
- In xe_devcoredump_read, copy directly from the human-readable (or compressed) buffer held in the devcoredump data pointer, based on offset / count
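The three steps above can be sketched in Python as follows. This is only an illustration of the plan under stated assumptions, not the actual patch: `CoredumpBuffer` and `read_all` are hypothetical names, the snapshot is modelled as a plain sequence of integers, and `zlib` stands in for whatever compression might be chosen.

```python
import zlib


class CoredumpBuffer:
    """Hypothetical model: format the snapshot once, then serve reads by slicing."""

    def __init__(self, words, compress=False):
        # Step 1: render the snapshot into a human-readable buffer, once.
        text = "".join(f"{w:08x}\n" for w in words).encode("ascii")
        # Step 2: optionally compress the rendered buffer.
        self.data = zlib.compress(text) if compress else text

    def read(self, offset, count):
        # Step 3: each read is a plain copy based on offset / count.
        return self.data[offset:offset + count]


def read_all(buf, chunk=4096):
    """Read the buffer in fixed-size chunks until an empty read (end of file)."""
    pieces = []
    offset = 0
    while True:
        piece = buf.read(offset, chunk)
        if not piece:
            break
        pieces.append(piece)
        offset += len(piece)
    return b"".join(pieces)


words = range(100_000)
plain = read_all(CoredumpBuffer(words))
packed = read_all(CoredumpBuffer(words, compress=True))
print("plain bytes:", len(plain), "compressed bytes:", len(packed))
```

Because the formatting happens exactly once, the total cost of reading the dump is proportional to its size, regardless of how many chunks the reader asks for.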
- Developer
Here is a fix [1] without compression, we can add that in later if desired.
My test case went from 40s to instantaneous.
- Author Reporter
root@martianriver:~# time cp /sys/class/drm/card0/device/devcoredump/data gpu-hang.data
real 0m0.313s
user 0m0.008s
sys 0m0.298s
root@martianriver:~# ls -lh gpu-hang.data
-rw------- 1 root root 221M Jul 26 14:47 gpu-hang.data
Going from an estimated 221 minutes to 0.3 seconds, I'd say it's an improvement.
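For reference, the arithmetic behind that estimate can be checked with a few lines of Python. The inputs are the figures quoted in this thread (about 1 MB per minute before the fix, 221M copied in 0.313 s after it). The 221M reported by ls is treated as roughly 221 MB, so the results are approximate.

```python
size_mb = 221.0              # size reported by ls -lh after the fix
old_rate_mb_per_min = 1.0    # "about 1 MB per minute" measured before the fix
new_time_s = 0.313           # "real" time reported by time cp

old_time_min = size_mb / old_rate_mb_per_min   # estimated copy time before the fix
new_rate_mb_per_s = size_mb / new_time_s       # throughput after the fix
speedup = (old_time_min * 60.0) / new_time_s   # ratio of the two copy times

print(f"estimated old copy time: {old_time_min:.0f} minutes")
print(f"new throughput: {new_rate_mb_per_s:.0f} MB/s")
print(f"speedup: about {speedup:.0f}x")
```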
- Developer
Hopefully we can get this in pretty quickly. Rodrigo wrote this and is out for 10 days or so, but he gave me a thumbs up on my rough plan before he left.
- Author Reporter
No problem, I'll just keep carrying this patch locally.
Let's just make sure that when people have LNL they also have this patch. I'm not sure whether it will need to Cc stable.
- Author Reporter
But I just noticed this in dmesg:
[ 657.271247] ------------[ cut here ]------------
[ 657.271271] refcount_t: underflow; use-after-free.
[ 657.271285] WARNING: CPU: 3 PID: 1751 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
[ 657.271292] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_seq_device qrtr rfkill binfmt_misc nls_ascii nls_cp437 vfat fat intel_uncore_frequency intel_uncore_frequency_common x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm snd_sof_pci_intel_lnl snd_sof_pci_intel_mtl snd_sof_intel_hda_generic snd_sof_pci snd_sof_xtensa_dsp snd_sof_intel_hda_common snd_soc_hdac_hda crc32_pclmul snd_sof_intel_hda crc32c_intel snd_sof snd_sof_utils ghash_clmulni_intel snd_soc_acpi_intel_match sha512_ssse3 snd_soc_acpi sha256_ssse3 snd_soc_core sha1_ssse3 intel_rapl_msr snd_compress snd_sof_intel_hda_mlink snd_hda_ext_core snd_hda_intel snd_intel_dspcfg snd_hda_codec aesni_intel snd_hwdep crypto_simd snd_hda_core cryptd snd_pcm processor_thermal_device_pci wmi_bmof rapl pcspkr processor_thermal_device snd_timer processor_thermal_wt_hint processor_thermal_rfim snd processor_thermal_rapl mei_me thunderbolt ucsi_acpi mei intel_rapl_common soundcore typec_ucsi roles processor_thermal_wt_req processor_thermal_power_floor
[ 657.271342] processor_thermal_mbox typec button battery int3403_thermal int340x_thermal_zone intel_pmc_core intel_skl_int3472_tps68470 intel_hid int3400_thermal sparse_keymap acpi_thermal_rel intel_vsec joydev pmt_telemetry acpi_pad pmt_class acpi_tad intel_skl_int3472_discrete evdev msr parport_pc ppdev lp parport configfs efi_pstore nfnetlink efivarfs ip_tables x_tables autofs4 ax88796b asix phylink selftests usbnet mii libphy hid_generic usbhid hid xe nvme nvme_core drm_ttm_helper t10_pi ttm i2c_algo_bit gpu_sched drm_buddy drm_suballoc_helper drm_gpuvm drm_exec crc64_rocksoft crc64 drm_display_helper crc_t10dif crct10dif_generic xhci_pci drm_kms_helper xhci_hcd crct10dif_pclmul crct10dif_common drm usbcore intel_lpss_pci intel_lpss intel_ish_ipc idma64 intel_ishtp usb_common video wmi fan
[ 657.271388] CPU: 3 PID: 1751 Comm: Xorg Not tainted 6.10.0-rc7pz+ #45
[ 657.271390] Hardware name: Intel Corporation Lunar Lake Client Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3093.D87.2403190644 03/19/2024
[ 657.271391] RIP: 0010:refcount_warn_saturate+0xbe/0x110
[ 657.271394] Code: 01 01 e8 55 e6 92 ff 0f 0b c3 cc cc cc cc 80 3d 69 cf 65 01 00 75 85 48 c7 c7 78 60 30 b3 c6 05 59 cf 65 01 01 e8 32 e6 92 ff <0f> 0b c3 cc cc cc cc 80 3d 47 cf 65 01 00 0f 85 5e ff ff ff 48 c7
[ 657.271395] RSP: 0000:ffffb30d45463828 EFLAGS: 00010282
[ 657.271397] RAX: 0000000000000000 RBX: ffff8f460a7d3800 RCX: 0000000000000027
[ 657.271398] RDX: ffff8f48af8dc9c8 RSI: 0000000000000001 RDI: ffff8f48af8dc9c0
[ 657.271399] RBP: ffffb30d454639d0 R08: 0000000000000000 R09: 0000000000000003
[ 657.271400] R10: ffffb30d454636c0 R11: ffffffffb39efd68 R12: ffff8f460a7d3800
[ 657.271401] R13: 00000000fffffe00 R14: 0000000000000001 R15: 0000000000000000
[ 657.271402] FS: 00007f008a564580(0000) GS:ffff8f48af8c0000(0000) knlGS:0000000000000000
[ 657.271403] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 657.271404] CR2: 0000224700cdd488 CR3: 0000000131de2003 CR4: 0000000000f70ef0
[ 657.271405] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 657.271405] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 657.271406] PKRU: 55555554
[ 657.271407] Call Trace:
[ 657.271408] <TASK>
[ 657.271409] ? __warn+0x8c/0x180
[ 657.271412] ? refcount_warn_saturate+0xbe/0x110
[ 657.271414] ? report_bug+0x164/0x190
[ 657.271418] ? handle_bug+0x3a/0x70
[ 657.271421] ? exc_invalid_op+0x17/0x70
[ 657.271422] ? asm_exc_invalid_op+0x1a/0x20
[ 657.271428] ? refcount_warn_saturate+0xbe/0x110
[ 657.271431] ? refcount_warn_saturate+0xbe/0x110
[ 657.271432] xe_sync_entry_cleanup+0xc9/0xf0 [xe]
[ 657.271508] xe_vm_bind_ioctl+0x1659/0x2340 [xe]
[ 657.271562] ? find_held_lock+0x2b/0x80
[ 657.271570] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 657.271621] ? drm_ioctl_kernel+0xb5/0x110 [drm]
[ 657.271652] drm_ioctl_kernel+0xb5/0x110 [drm]
[ 657.271670] drm_ioctl+0x27a/0x4e0 [drm]
[ 657.271687] ? __pfx_xe_vm_bind_ioctl+0x10/0x10 [xe]
[ 657.271752] xe_drm_ioctl+0x56/0x80 [xe]
[ 657.271784] __x64_sys_ioctl+0x94/0xd0
[ 657.271788] do_syscall_64+0x90/0x1a0
[ 657.271789] ? drm_syncobj_fence_get+0x1d/0x1f0 [drm]
[ 657.271806] ? find_held_lock+0x2b/0x80
[ 657.271809] ? __lock_acquire+0x41b/0x2570
[ 657.271812] ? lock_acquire+0xc0/0x2c0
[ 657.271814] ? select_task_rq_fair+0x141/0x1d40
[ 657.271817] ? find_held_lock+0x2b/0x80
[ 657.271818] ? select_task_rq_fair+0x234/0x1d40
[ 657.271820] ? lock_release+0xbf/0x280
[ 657.271822] ? select_task_rq_fair+0x23e/0x1d40
[ 657.271825] ? lock_acquire+0xc0/0x2c0
[ 657.271826] ? try_to_wake_up+0x53/0x8a0
[ 657.271827] ? find_held_lock+0x2b/0x80
[ 657.271829] ? try_to_wake_up+0x1f0/0x8a0
[ 657.271830] ? lock_release+0xbf/0x280
[ 657.271832] ? find_held_lock+0x2b/0x80
[ 657.271834] ? rcu_core+0x2e2/0x3e0
[ 657.271836] ? lockdep_hardirqs_on_prepare+0xa7/0x190
[ 657.271838] ? rcu_nocb_unlock_irqrestore+0x5d/0x70
[ 657.271840] ? rcu_core+0x19f/0x3e0
[ 657.271842] ? trace_hardirqs_off+0x4b/0xc0
[ 657.271844] ? handle_softirqs+0x437/0x450
[ 657.271848] ? lockdep_hardirqs_on_prepare+0xda/0x190
[ 657.271850] entry_SYSCALL_64_after_hwframe+0x71/0x79
[ 657.271852] RIP: 0033:0x7f008a91071b
[ 657.271854] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[ 657.271855] RSP: 002b:00007ffe68a8b1f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 657.271857] RAX: ffffffffffffffda RBX: 00007ffe68a8b2b0 RCX: 00007f008a91071b
[ 657.271858] RDX: 00007ffe68a8b2b0 RSI: 0000000040886445 RDI: 0000000000000012
[ 657.271859] RBP: 00007ffe68a8b370 R08: 0000000000030000 R09: 000055c01d565f20
[ 657.271860] R10: 0000fffeff000000 R11: 0000000000000246 R12: 0000000000000012
[ 657.271860] R13: 000055c01b8b16b8 R14: 000055c01b8b1230 R15: 000055c01d671f80
[ 657.271864] </TASK>
[ 657.271865] irq event stamp: 14799383
[ 657.271866] hardirqs last enabled at (14799389): [<ffffffffb1f5553b>] console_unlock+0x11b/0x140
[ 657.271869] hardirqs last disabled at (14799394): [<ffffffffb1f55520>] console_unlock+0x100/0x140
[ 657.271870] softirqs last enabled at (14799166): [<ffffffffb1ea28ea>] __irq_exit_rcu+0x9a/0xc0
[ 657.271872] softirqs last disabled at (14799161): [<ffffffffb1ea28ea>] __irq_exit_rcu+0x9a/0xc0
[ 657.271873] ---[ end trace 0000000000000000 ]---
- Author Reporter
Not sure it's related to the GPU hang though. Will re-test.
- Author Reporter
It does not happen reliably. I just realized it matches what I had already pasted in mesa/mesa#11526. I'll open a separate report for it.
- Author Reporter
- Paulo Zanoni mentioned in issue mesa/mesa#11526
- Matthew Brost closed with commit 4f04d07c