[regression, bisected] Crash in iris_batch_flush
Firefox telemetry registered several crashes with the following stack:
0 firefox-bin mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:35
1 firefox-bin abort memory/mozalloc/mozalloc_abort.cpp:88
2 libgallium_dri.so [clone .lto_priv.0] [clone .cold] /usr/src/debug/mesa-22.1.3/src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:3653
3 libsqlite3.so.0 libsqlite3.so.0@0x0000000000082fe3
4 libgallium_dri.so tc_call_flush_resource.lto_priv.0 /usr/src/debug/mesa-22.1.3/src/gallium/auxiliary/util/u_threaded_context.c:3790
5 libgallium_dri.so _fini
6 libgallium_dri.so iris_fence_flush /usr/src/debug/mesa-22.1.3/src/gallium/drivers/iris/iris_fence.c:267
7 None @0x00007f4adb9b737f
8 libgallium_dri.so st_context_flush /usr/src/debug/mesa-22.1.3/src/mesa/state_tracker/st_manager.c:808
9 libgallium_dri.so dri_flush /usr/src/debug/mesa-22.1.3/src/gallium/frontends/dri/dri_drawable.c:522
Commit 658a0c632625e1db51837ff754fe18a6a7f2ccf8 (drm/i915: don't call free_mmap_offset when purging) is the likely culprit; reverting it fixes the issue. According to the analysis, though, the crash depends on both the Mesa version and the Linux kernel version.
I checked the Mesa revisions more closely: 21.0.3 works fine with any kernel, and kernel 5.17 works fine with any Mesa. Every combination of Mesa 22+ and kernel 5.18+ shows the error, but a single error-free older component is enough to get rid of the issue. I tried to bisect the kernel but have not succeeded so far: kernel 5.18-rc1 does not work at all on my hardware, so there is a wide range of commits that I cannot classify as good or bad.
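To make the combination described above easier to check against other setups, here is a minimal Python sketch. The function names are hypothetical, and the exact boundaries are an assumption taken from this report only: Mesa releases between 21.0.3 and 22.0 and kernels between 5.17 and 5.18 were not tested individually.

```python
def parse_version(text):
    """Turn '22.1.3' or '6.1-rc5' into a tuple of ints, dropping any -rc suffix."""
    core = text.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def crash_expected(mesa, kernel):
    """Return True if the reported crash is expected for this pair of versions.

    Assumption from the report: the crash needs BOTH Mesa 22+ and kernel 5.18+.
    Replacing either component with an older one avoids it.
    """
    mesa_affected = parse_version(mesa) >= (22,)
    kernel_affected = parse_version(kernel) >= (5, 18)
    return mesa_affected and kernel_affected


if __name__ == "__main__":
    for mesa, kernel in [("22.1.3", "6.1-rc5"), ("21.0.3", "6.1-rc5"), ("22.1.3", "5.17")]:
        print(mesa, kernel, crash_expected(mesa, kernel))
```

This only encodes the observation; it says nothing about which side actually contains the bug.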
Next comment:
I reverted this kernel commit: https://cgit.freedesktop.org/drm-tip/commit/?id=658a0c632625e1db51837ff754fe18a6a7f2ccf8 (drm/i915: don't call free_mmap_offset when purging).
It seems that the idea of "let's not free the memory here and wait until all those pieces are freed later, since they will be freed anyway" is a bad one in this case.
With a simple patch -R on top of the 6.1-rc5 kernel, the revert still applies without any issue. I am not sure whether it solves the problem completely, but I have not seen a crash since, even with the latest Mesa.
Anyway, this is only my experience on one laptop with one Gentoo setup, so it would be good to try it in other environments.
@mwa, any ideas?