[regression, bisected] Crash in iris_batch_flush
The Firefox telemetry registered several crashes.
0 firefox-bin mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:35
1 firefox-bin abort memory/mozalloc/mozalloc_abort.cpp:88
2 libgallium_dri.so [clone .lto_priv.0] [clone .cold] /usr/src/debug/mesa-22.1.3/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:3653
3 libsqlite3.so.0 libsqlite3.so.0@0x0000000000082fe3
4 libgallium_dri.so tc_call_flush_resource.lto_priv.0 /usr/src/debug/mesa-22.1.3/src/gallium/auxiliary/util/u_threaded_context.c:3790
5 libgallium_dri.so _fini
6 libgallium_dri.so iris_fence_flush /usr/src/debug/mesa-22.1.3/src/gallium/drivers/iris/iris_fence.c:267
7 None @0x00007f4adb9b737f
8 libgallium_dri.so st_context_flush /usr/src/debug/mesa-22.1.3/src/mesa/state_tracker/st_manager.c:808
9 libgallium_dri.so dri_flush /usr/src/debug/mesa-22.1.3/src/gallium/frontends/dri/dri_drawable.c:522
Commit 658a0c632625e1db51837ff754fe18a6a7f2ccf8 (drm/i915: don't call free_mmap_offset when purging) is the likely culprit; reverting it fixes the issue. According to the analysis, though, the problem involves both Mesa and the Linux kernel.
I checked the Mesa revisions more closely: 21.0.3 works fine with any kernel, and kernel 5.17 works fine with any Mesa. All Mesa 22+ releases combined with 5.18+ kernels show the error, but a single older, error-free component is enough to get rid of the issue. I tried to bisect the kernel but have not succeeded so far: kernel 5.18-rc1 does not work at all on my hardware, so there is a wide range of commits that I cannot mark as good or bad.
Next comment:
I reverted this kernel commit: https://cgit.freedesktop.org/drm-tip/commit/?id=658a0c632625e1db51837ff754fe18a6a7f2ccf8 (drm/i915: don't call free_mmap_offset when purging)
It seems that the idea of "let's not free the memory here and wait until all those pieces are freed later, since they will be freed anyway" is a bad one in this case.
A simple patch -R against the 6.1-rc5 kernel still applies without any issue. I am not sure whether it solves the problem completely, but I have not seen a crash since with the latest Mesa.
Anyway, this is only my experience on one laptop with one Gentoo setup, so it would be better to try it in other environments.
@mwa, any ideas?
- Reporter
- Developer
Hmm, that's very strange. Is this on an integrated part? Nothing in i915 on integrated should be using obj->base.vma_node for anything, so the above patch should not make any difference. Starting from commit
cc662126b413 ("drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET")
we manage the vma_nodes in i915, instead of using base.vma_node, since we need more than one node.
- Author
I saw it on the Dell Latitude E7250 with
00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 5500 [8086:1616] (rev 09)
No idea about the other reporters over at Mozilla’s Bugzilla.
- Developer
Yeah, not currently seeing why that commit would be an issue then.
Anyway, in that Bugzilla there is mention of:
iris: Failed to submit batchbuffer: No space left on device
Which I presume means we hit -ENOSPC when submitting the execbuffer/batchbuffer. Usually that means we have either run out of GTT space (unlikely on modern HW), or that we were unable to place a buffer at a given GTT offset, since it's already being used, but normally GTT eviction should handle that. However it seems there is a known bug/issue in that area: https://lore.kernel.org/lkml/CAHzEqDnTgtB6VCprpSQfR7_tjkZV3-1dtfbiYH8_mQHCbCkY0Q@mail.gmail.com/. So wondering if that's related here...
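For readers matching the userspace message to the kernel error code, here is a minimal Python sketch that maps a negative, kernel-style return value to its symbolic name and message. The helper name `describe_kernel_error` is hypothetical and only for illustration; it uses the standard `errno` and `os` modules and says nothing about how iris or i915 report errors internally.

```python
import errno
import os


def describe_kernel_error(ret):
    """Translate a negative, kernel-style return value into (name, message).

    Hypothetical helper for illustration only.
    """
    if ret >= 0:
        return ("OK", "success")
    code = -ret
    # errno.errorcode maps numeric codes to symbolic names such as "ENOSPC".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    return (name, os.strerror(code))


if __name__ == "__main__":
    name, message = describe_kernel_error(-errno.ENOSPC)
    print(f"{name}: {message}")
```

On a typical Linux system the message for ENOSPC should be the same "No space left on device" text quoted from the Bugzilla report.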
Can we check if the issue is present before the commit: https://github.com/torvalds/linux/commit/7e00897be8bf13ef9c68c95a8e386b714c29ad95? I don't think a simple revert will work.
It is strange, but reverting your commit really helps. It does not eliminate the issue completely: Firefox still crashes sometimes when playing AV1 via VA-API, but only very rarely. Playback lasted for hours, whereas with that commit 20-30 minutes of AV1 on YouTube almost always ended in a crash.
Tried to put the commit back and applied https://lore.kernel.org/lkml/20221110053133.2433412-1-mani@chromium.org/ as suggested. Looks good so far, at least it did not crash during the first 30 minutes.
- Developer
Yeah, maybe it's a race which is very timing sensitive. The discussion in https://lore.kernel.org/lkml/20221110053133.2433412-1-mani@chromium.org/ revolves around contention on the object lock just as we attempt to evict it to make room in the GTT. The caller of drm_gem_free_mmap_offset() here is i915_gem_object_truncate() which is likely called by the shrinker when under memory pressure (concurrently), and requires holding the object lock when calling into that. So perhaps removing drm_gem_free_mmap_offset() just changes the timing enough such that triggering the issue becomes easier for your case...
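To make the timing argument concrete, here is a toy Python model of the contention described above. It is not the i915 code and not the patch from the linked thread: `Buffer`, `evict_trylock_only` and `evict_with_wait` are hypothetical names, and waiting for the lock holder is only an assumed way of handling contention. The sketch shows how an eviction path that only trylocks the object can report -ENOSPC when something else (for example a shrinker) happens to hold the lock at that moment.

```python
import errno
import threading


class Buffer:
    """Hypothetical stand-in for an object occupying a GTT range."""

    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # plays the role of the object lock
        self.bound = True


def evict_trylock_only(buf):
    """Evict buf, giving up immediately if its lock is contended."""
    if not buf.lock.acquire(blocking=False):
        return -errno.ENOSPC  # contention surfaces as "no space"
    try:
        buf.bound = False
        return 0
    finally:
        buf.lock.release()


def evict_with_wait(buf, timeout=2.0):
    """Evict buf, waiting (up to timeout seconds) for a contended lock."""
    if not buf.lock.acquire(timeout=timeout):
        return -errno.ENOSPC
    try:
        buf.bound = False
        return 0
    finally:
        buf.lock.release()


if __name__ == "__main__":
    buf = Buffer("bo0")
    buf.lock.acquire()  # simulate a concurrent holder of the object lock
    print("trylock only:", evict_trylock_only(buf))
    # Release the lock shortly afterwards, as a concurrent holder would.
    threading.Timer(0.05, buf.lock.release).start()
    print("with wait:", evict_with_wait(buf))
```

In this model the first call fails only because of when it ran, which is consistent with the observation that small timing changes make the crash more or less likely.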
The patch sent by Mani works perfectly: AV1 has been playing in Firefox non-stop since then with no crash so far, and that is with the removal of drm_gem_free_mmap_offset() restored. So I guess https://lore.kernel.org/lkml/20221110053133.2433412-1-mani@chromium.org/ and this issue and firefox bug https://bugzilla.mozilla.org/show_bug.cgi?id=1779558 are actually the same story.
I tried applying that patch to 6.0.10 but the build fails with:
ERROR: modpost: "i915_gem_ww_unlock_single" [drivers/gpu/drm/i915/kvmgt.ko] undefined!
OK, I got it to build, but it triggers use-after-free and refcount saturation warnings.
Backtraces
------------[ cut here ]------------ refcount_t: addition on 0; use-after-free. WARNING: CPU: 9 PID: 2620 at lib/refcount.c:25 refcount_warn_saturate+0x9b/0x110 Modules linked in: ccm hid_sensor_custom_intel_hinge hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_custom snd_ctl_led hid_sensor_hub rfcomm snd_soc_sof_sdw snd_soc_intel_hda_dsp_common snd_sof_probes snd_soc_intel_sof_maxim_common cmac joydev algif_hash mousedev snd_soc_rt715_sdca algif_skcipher regmap_sdw_mbq snd_soc_rt1316_sdw af_alg regmap_sdw snd_soc_dmic bnep spi_pxa2xx_platform intel_ishtp_hid dw_dmac 8250_dw iTCO_wdt intel_pmc_bxt hid_multitouch iTCO_vendor_support mei_hdcp mei_pxp intel_rapl_msr dell_laptop dell_wmi dell_wmi_sysman dell_smbios firmware_attributes_class wmi_bmof dell_wmi_descriptor ledtrig_audio dcdbas intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl intel_cstate intel_uncore psmouse snd_hrtimer xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat snd_hda_codec_hdmi nf_tables snd_sof_pci_intel_tgl nfnetlink snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel br_netfilter snd_intel_dspcfg bridge snd_intel_sdw_acpi iwlmvm snd_hda_codec spi_nor stp llc mtd snd_hda_core btusb mac80211 i2c_i801 snd_hwdep btrtl i2c_smbus btbcm snd_pcm btintel mei_me libarc4 btmtk intel_lpss_pci mei intel_lpss iwlwifi idma64 bluetooth processor_thermal_device_pci ecdh_generic crc16 processor_thermal_device processor_thermal_rfim intel_ish_ipc processor_thermal_mbox intel_ishtp ucsi_acpi processor_thermal_rapl qrtr thunderbolt intel_rapl_common typec_ucsi 
typec igen6_edac roles wmi i2c_hid_acpi i2c_hid int3403_thermal int340x_thermal_zone intel_skl_int3472_tps68470 tps68470_regulator clk_tps68470 intel_skl_int3472_discrete intel_hid sparse_keymap int3400_thermal acpi_thermal_rel acpi_tad acpi_pad vfat fat cfg80211 rfkill mac_hid usbip_host usbip_core snd_seq_dummy pkcs8_key_parser snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_timer snd soundcore cuse dm_multipath sg crypto_user fuse ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted asn1_encoder tee usbhid dm_mod i915 serio_raw atkbd drm_buddy crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni libps2 polyval_generic vivaldi_fmap gf128mul ghash_clmulni_intel intel_gtt nvme drm_display_helper aesni_intel crypto_simd cec nvme_core spi_intel_pci cryptd xhci_pci spi_intel ttm nvme_common xhci_pci_renesas i8042 video serio CPU: 9 PID: 2620 Comm: CanvasRenderer Tainted: G S 6.0.10-zen3-1-zen #1 c85e63f598d445574c2771ff55986cdd405812ce Hardware name: Dell Inc. 
XPS 9320/0CW9KM, BIOS 1.9.0 09/23/2022
RIP: 0010:refcount_warn_saturate+0x9b/0x110
Code: 01 01 e8 70 aa 73 00 0f 0b c3 cc cc cc cc 80 3d 3b 97 b6 01 00 75 a8 48 c7 c7 28 d4 d4 95 c6 05 2b 97 b6 01 01 e8 4d aa 73 00 <0f> 0b c3 cc cc cc cc 80 3d 17 97 b6 01 00 75 85 48 c7 c7 58 d4 d4
RSP: 0018:ffffa59882bff7d8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffffa59882bffae8 RCX: 0000000000000027
RDX: ffff8ba5ef861668 RSI: 0000000000000001 RDI: ffff8ba5ef861660
RBP: ffff8b9f0b915180 R08: 0000000000000001 R09: 00000000ffffffea
R10: ffffffff9645b6a0 R11: 0000000000000002 R12: ffff8ba047b7d940
R13: 0000fffeffe97000 R14: ffff8b9f0b915180 R15: 0000000000000000
FS: 00007fea3cb7d6c0(0000) GS:ffff8ba5ef840000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe9fb0242d8 CR3: 0000000152fde001 CR4: 0000000000f70ee0
PKRU: 55555554
Call Trace:
 <TASK>
 grab_vma+0x154/0x1b0 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_evict_for_node+0xfc/0x2f0 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_gtt_reserve+0x55/0x80 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_vma_pin_ww+0x300/0xa00 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 eb_validate_vmas+0x444/0x850 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_do_execbuffer+0xebe/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_execbuffer2_ioctl+0x119/0x280 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 ? i915_gem_do_execbuffer+0x2860/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 drm_ioctl_kernel+0xca/0x170
 drm_ioctl+0x231/0x410
 ? i915_gem_do_execbuffer+0x2860/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 __x64_sys_ioctl+0x91/0xd0
 do_syscall_64+0x5c/0x90
 ? syscall_exit_to_user_mode+0x2c/0x1d0
 ? do_syscall_64+0x6b/0x90
 ? do_user_addr_fault+0x1e9/0x6c0
 ? sched_clock_cpu+0xd/0xb0
 ? exc_page_fault+0x74/0x170
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fea578f5c0f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007fea3cb7b860 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fea00c59560 RCX: 00007fea578f5c0f
RDX: 00007fea3cb7b910 RSI: 0000000040406469 RDI: 0000000000000027
RBP: 00007fea3cb7b9a0 R08: 00007fe9dbb9f000 R09: 00007fea57600200
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fea3cb7b910 R14: 0000000000000027 R15: 00007fe9d8f7c000
 </TASK>
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------ refcount_t: saturated; leaking memory. WARNING: CPU: 9 PID: 2620 at lib/refcount.c:22 refcount_warn_saturate+0x55/0x110 Modules linked in: ccm hid_sensor_custom_intel_hinge hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_custom snd_ctl_led hid_sensor_hub rfcomm snd_soc_sof_sdw snd_soc_intel_hda_dsp_common snd_sof_probes snd_soc_intel_sof_maxim_common cmac joydev algif_hash mousedev snd_soc_rt715_sdca algif_skcipher regmap_sdw_mbq snd_soc_rt1316_sdw af_alg regmap_sdw snd_soc_dmic bnep spi_pxa2xx_platform intel_ishtp_hid dw_dmac 8250_dw iTCO_wdt intel_pmc_bxt hid_multitouch iTCO_vendor_support mei_hdcp mei_pxp intel_rapl_msr dell_laptop dell_wmi dell_wmi_sysman dell_smbios firmware_attributes_class wmi_bmof dell_wmi_descriptor ledtrig_audio dcdbas intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl intel_cstate intel_uncore psmouse snd_hrtimer xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat snd_hda_codec_hdmi nf_tables snd_sof_pci_intel_tgl nfnetlink snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel br_netfilter snd_intel_dspcfg bridge snd_intel_sdw_acpi iwlmvm snd_hda_codec spi_nor stp llc mtd snd_hda_core btusb mac80211 i2c_i801 snd_hwdep btrtl i2c_smbus btbcm snd_pcm btintel mei_me libarc4 btmtk intel_lpss_pci mei intel_lpss iwlwifi idma64 bluetooth processor_thermal_device_pci ecdh_generic crc16 processor_thermal_device processor_thermal_rfim intel_ish_ipc processor_thermal_mbox intel_ishtp ucsi_acpi processor_thermal_rapl qrtr thunderbolt intel_rapl_common typec_ucsi 
typec igen6_edac roles wmi i2c_hid_acpi i2c_hid int3403_thermal int340x_thermal_zone intel_skl_int3472_tps68470 tps68470_regulator clk_tps68470 intel_skl_int3472_discrete intel_hid sparse_keymap int3400_thermal acpi_thermal_rel acpi_tad acpi_pad vfat fat cfg80211 rfkill mac_hid usbip_host usbip_core snd_seq_dummy pkcs8_key_parser snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_timer snd soundcore cuse dm_multipath sg crypto_user fuse ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted asn1_encoder tee usbhid dm_mod i915 serio_raw atkbd drm_buddy crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni libps2 polyval_generic vivaldi_fmap gf128mul ghash_clmulni_intel intel_gtt nvme drm_display_helper aesni_intel crypto_simd cec nvme_core spi_intel_pci cryptd xhci_pci spi_intel ttm nvme_common xhci_pci_renesas i8042 video serio CPU: 0 PID: 2620 Comm: CanvasRenderer Tainted: G S W 6.0.10-zen3-1-zen #1 c85e63f598d445574c2771ff55986cdd405812ce Hardware name: Dell Inc. 
XPS 9320/0CW9KM, BIOS 1.9.0 09/23/2022
RIP: 0010:refcount_warn_saturate+0x55/0x110
Code: 84 bc 00 00 00 c3 cc cc cc cc 85 f6 74 23 80 3d 82 97 b6 01 00 75 ee 48 c7 c7 00 d4 d4 95 c6 05 72 97 b6 01 01 e8 93 aa 73 00 <0f> 0b c3 cc cc cc cc 80 3d 60 97 b6 01 00 75 cb 48 c7 c7 00 d4 d4
RSP: 0018:ffffa59882bff7d8 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffffa59882bffae8 RCX: 0000000000000027
RDX: ffff8ba5ef861668 RSI: 0000000000000001 RDI: ffff8ba5ef861660
RBP: ffff8b9f0b915180 R08: 0000000000000001 R09: 00000000ffffffea
R10: ffffffff9645b6a0 R11: 0000000000000002 R12: ffff8ba047b7d940
R13: ffff8b9f08a8ca18 R14: ffff8ba047b7db18 R15: ffffa59882bffb00
FS: 00007fea3cb7d6c0(0000) GS:ffff8ba5ef600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1fb633b000 CR3: 0000000152fde004 CR4: 0000000000f70ef0
PKRU: 55555554
Call Trace:
 <TASK>
 grab_vma+0x123/0x1b0 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_evict_for_node+0xfc/0x2f0 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_gtt_reserve+0x55/0x80 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_vma_pin_ww+0x300/0xa00 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 eb_validate_vmas+0x444/0x850 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_do_execbuffer+0xebe/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_execbuffer2_ioctl+0x119/0x280 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 ? i915_gem_do_execbuffer+0x2860/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 drm_ioctl_kernel+0xca/0x170
 drm_ioctl+0x231/0x410
 ? i915_gem_do_execbuffer+0x2860/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 __x64_sys_ioctl+0x91/0xd0
 do_syscall_64+0x5c/0x90
 ? syscall_exit_to_user_mode+0x2c/0x1d0
 ? do_syscall_64+0x6b/0x90
 ? do_user_addr_fault+0x1e9/0x6c0
 ? sched_clock_cpu+0xd/0xb0
 ? exc_page_fault+0x74/0x170
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fea578f5c0f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007fea3cb7b860 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fea00c59560 RCX: 00007fea578f5c0f
RDX: 00007fea3cb7b910 RSI: 0000000040406469 RDI: 0000000000000027
RBP: 00007fea3cb7b9a0 R08: 00007fe9dbb9f000 R09: 00007fea57600200
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fea3cb7b910 R14: 0000000000000027 R15: 00007fe9d8f7c000
 </TASK>
---[ end trace 0000000000000000 ]---
------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 0 PID: 2620 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110 Modules linked in: ccm hid_sensor_custom_intel_hinge hid_sensor_als hid_sensor_trigger industrialio_triggered_buffer kfifo_buf hid_sensor_iio_common industrialio hid_sensor_custom snd_ctl_led hid_sensor_hub rfcomm snd_soc_sof_sdw snd_soc_intel_hda_dsp_common snd_sof_probes snd_soc_intel_sof_maxim_common cmac joydev algif_hash mousedev snd_soc_rt715_sdca algif_skcipher regmap_sdw_mbq snd_soc_rt1316_sdw af_alg regmap_sdw snd_soc_dmic bnep spi_pxa2xx_platform intel_ishtp_hid dw_dmac 8250_dw iTCO_wdt intel_pmc_bxt hid_multitouch iTCO_vendor_support mei_hdcp mei_pxp intel_rapl_msr dell_laptop dell_wmi dell_wmi_sysman dell_smbios firmware_attributes_class wmi_bmof dell_wmi_descriptor ledtrig_audio dcdbas intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass rapl intel_cstate intel_uncore psmouse snd_hrtimer xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype nft_compat snd_hda_codec_hdmi nf_tables snd_sof_pci_intel_tgl nfnetlink snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel br_netfilter snd_intel_dspcfg bridge snd_intel_sdw_acpi iwlmvm snd_hda_codec spi_nor stp llc mtd snd_hda_core btusb mac80211 i2c_i801 snd_hwdep btrtl i2c_smbus btbcm snd_pcm btintel mei_me libarc4 btmtk intel_lpss_pci mei intel_lpss iwlwifi idma64 bluetooth processor_thermal_device_pci ecdh_generic crc16 processor_thermal_device processor_thermal_rfim intel_ish_ipc processor_thermal_mbox intel_ishtp ucsi_acpi processor_thermal_rapl qrtr thunderbolt intel_rapl_common typec_ucsi 
typec igen6_edac roles wmi i2c_hid_acpi i2c_hid int3403_thermal int340x_thermal_zone intel_skl_int3472_tps68470 tps68470_regulator clk_tps68470 intel_skl_int3472_discrete intel_hid sparse_keymap int3400_thermal acpi_thermal_rel acpi_tad acpi_pad vfat fat cfg80211 rfkill mac_hid usbip_host usbip_core snd_seq_dummy pkcs8_key_parser snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_timer snd soundcore cuse dm_multipath sg crypto_user fuse ip_tables x_tables btrfs blake2b_generic libcrc32c crc32c_generic xor raid6_pq dm_crypt cbc encrypted_keys trusted asn1_encoder tee usbhid dm_mod i915 serio_raw atkbd drm_buddy crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni libps2 polyval_generic vivaldi_fmap gf128mul ghash_clmulni_intel intel_gtt nvme drm_display_helper aesni_intel crypto_simd cec nvme_core spi_intel_pci cryptd xhci_pci spi_intel ttm nvme_common xhci_pci_renesas i8042 video serio CPU: 0 PID: 2620 Comm: CanvasRenderer Tainted: G S W 6.0.10-zen3-1-zen #1 c85e63f598d445574c2771ff55986cdd405812ce Hardware name: Dell Inc. 
XPS 9320/0CW9KM, BIOS 1.9.0 09/23/2022
RIP: 0010:refcount_warn_saturate+0xbe/0x110
Code: 01 01 e8 4d aa 73 00 0f 0b c3 cc cc cc cc 80 3d 17 97 b6 01 00 75 85 48 c7 c7 58 d4 d4 95 c6 05 07 97 b6 01 01 e8 2a aa 73 00 <0f> 0b c3 cc cc cc cc 80 3d f2 96 b6 01 00 0f 85 5e ff ff ff 48 c7
RSP: 0018:ffffa59882bff800 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8b9f0b915180 RCX: 0000000000000027
RDX: ffff8ba5ef621668 RSI: 0000000000000001 RDI: ffff8ba5ef621660
RBP: ffff8ba047b7d940 R08: 0000000000000001 R09: 00000000ffffffea
R10: ffffffff9645b6a0 R11: 0000000000000002 R12: ffffa59882bff600
R13: 0000fffeffe97000 R14: ffff8b9fc96d9400 R15: 0000000000000000
FS: 00007fea3cb7d6c0(0000) GS:ffff8ba5ef600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1fb633b000 CR3: 0000000152fde004 CR4: 0000000000f70ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ungrab_vma+0x45/0xa0 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_evict_for_node+0x266/0x2f0 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_gtt_reserve+0x55/0x80 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_vma_pin_ww+0x300/0xa00 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 eb_validate_vmas+0x444/0x850 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_do_execbuffer+0xebe/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 i915_gem_execbuffer2_ioctl+0x119/0x280 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 ? i915_gem_do_execbuffer+0x2860/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 drm_ioctl_kernel+0xca/0x170
 drm_ioctl+0x231/0x410
 ? i915_gem_do_execbuffer+0x2860/0x2860 [i915 47b0d499a18a3984d60286f1ebb98ec59badf95a]
 __x64_sys_ioctl+0x91/0xd0
 do_syscall_64+0x5c/0x90
 ? syscall_exit_to_user_mode+0x2c/0x1d0
 ? do_syscall_64+0x6b/0x90
 ? do_user_addr_fault+0x1e9/0x6c0
 ? sched_clock_cpu+0xd/0xb0
 ? exc_page_fault+0x74/0x170
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fea578f5c0f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
RSP: 002b:00007fea3cb7b860 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fea00c59560 RCX: 00007fea578f5c0f
RDX: 00007fea3cb7b910 RSI: 0000000040406469 RDI: 0000000000000027
RBP: 00007fea3cb7b9a0 R08: 00007fe9dbb9f000 R09: 00007fea57600200
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fea3cb7b910 R14: 0000000000000027 R15: 00007fe9d8f7c000
 </TASK>
---[ end trace 0000000000000000 ]---
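The three warnings above ("addition on 0", "saturated", "underflow") appear in an order that a saturating reference counter would produce if a reference were taken on an object whose count had already dropped to zero. The toy Python model below illustrates that sequence only. `ToyRefcount` is a hypothetical class, the saturation rules are an assumption modelled loosely on the kernel's refcount_t, and it makes no claim about which code path in the patched driver actually took the stale reference.

```python
# Assumed saturation value, mirroring the idea of INT_MIN / 2.
SATURATED = -(2**31) // 2


class ToyRefcount:
    """Toy saturating reference counter (illustrative, not kernel code)."""

    def __init__(self, value=1):
        self.value = value
        self.warnings = []

    def _saturate(self, message):
        # After any misuse the counter is pinned so the object is never freed.
        self.value = SATURATED
        self.warnings.append(message)

    def inc(self):
        old = self.value
        self.value += 1
        if old == 0:
            self._saturate("addition on 0; use-after-free")
        elif old < 0:
            self._saturate("saturated; leaking memory")

    def dec_and_test(self):
        """Drop one reference; return True if this was the last one."""
        old = self.value
        self.value -= 1
        if old == 1:
            return True
        if old <= 0:
            self._saturate("underflow; use-after-free")
        return False


if __name__ == "__main__":
    rc = ToyRefcount(1)
    rc.dec_and_test()  # last reference dropped, object would be freed
    rc.inc()           # stale pointer takes a reference
    rc.inc()           # a second reference on the now-saturated counter
    rc.dec_and_test()  # reference dropped again
    for warning in rc.warnings:
        print("refcount_t:", warning)
```

In this model, once the first misuse pins the counter, every later get or put on the same object warns as well, which would explain why the three traces come from the same PID within a single ioctl.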
Edited by Jan Alexander Steffens
- Developer
Could you check to see if the following patch helps: https://patchwork.freedesktop.org/series/111271/
I am running 6.1-rc7 with this patch applied; so far so good. No crash seen while playing AV1 in Firefox using VA-API for 1+ hours of continuous play. It seems to fix the issue.
Also looks good so far here. I added this patch to 6.0.11-arch1 and 6.0.11-zen1.
- Suresh added platform: ADL_P label
- Matthew Auld marked this issue as related to #7627 (closed)
- Jan Alexander Steffens mentioned in issue mesa/mesa#5600
- Developer
Now pushed to drm-tip:
801fa7a81f6d drm/i915: improve the catch-all evict to handle lock contention
- Matthew Auld closed
- Matthew Auld mentioned in commit rodrigovivi/drm-xe@e22ff1af
- Matthew Auld mentioned in commit rodrigovivi/drm-xe@801fa7a8
- Matthew Auld mentioned in commit nouveau@3f882f2d
- Matthew Auld mentioned in commit igt-gpu-tools@4f22b49e
- Sultan Alsawaf mentioned in issue mesa/mesa#7239 (closed)
- Matthew Auld mentioned in commit nouveau@9f9748bb
- Matthew Auld mentioned in commit nouveau@ea62bd76