few tests - abort/dmesg-warn - BUG kmalloc-* Poison overwritten
<6> [519.242188] Console: switching to colour dummy device 80x25
<6> [519.242425] [IGT] xe_exec_threads: executing
<6> [519.244885] [IGT] xe_exec_threads: starting subtest threads-mixed-userptr-invalidate
<3> [519.566085] =============================================================================
<3> [519.566146] BUG kmalloc-1k (Tainted: G U N): Poison overwritten
<3> [519.566167] -----------------------------------------------------------------------------
<3> [519.566194] 0xffff888209c24de0-0xffff888209c24de1 @offset=19936. First byte 0x7d instead of 0x6b
<3> [519.566219] Allocated in xe_file_open+0x2c/0x140 [xe] age=154 cpu=18 pid=2243
<4> [519.566322] kmalloc_trace+0x389/0x3c0
<4> [519.566326] xe_file_open+0x2c/0x140 [xe]
<4> [519.566368] drm_file_alloc+0x1d6/0x2b0 [drm]
<4> [519.566410] drm_open_helper+0x7f/0x160 [drm]
<4> [519.566434] drm_open+0x74/0x120 [drm]
<4> [519.566455] drm_stub_open+0xb0/0x120 [drm]
<4> [519.566480] chrdev_open+0xab/0x1f0
<4> [519.566483] do_dentry_open+0x17b/0x580
<4> [519.566485] vfs_open+0x33/0x40
<4> [519.566488] path_openat+0x6e1/0xa70
<4> [519.566491] do_filp_open+0xb0/0x120
<4> [519.566494] do_sys_openat2+0x250/0x2a0
<4> [519.566496] do_sys_open+0x46/0x80
<4> [519.566498] __x64_sys_openat+0x20/0x30
<4> [519.566500] x64_sys_call+0x1b0c/0x20c0
<4> [519.566503] do_syscall_64+0x88/0x140
<3> [519.566507] Freed in xe_file_close+0x19a/0x1e0 [xe] age=114 cpu=0 pid=2243
<4> [519.566567] kfree+0x2aa/0x320
<4> [519.566570] xe_file_close+0x19a/0x1e0 [xe]
<4> [519.566611] drm_file_free+0x21a/0x290 [drm]
<4> [519.566633] drm_close_helper.isra.0+0x73/0x80 [drm]
<4> [519.566653] drm_release_noglobal+0x25/0x80 [drm]
<4> [519.566674] __fput+0xb9/0x310
<4> [519.566676] __fput_sync+0x1a/0x20
<4> [519.566678] __x64_sys_close+0x3e/0x80
<4> [519.566680] x64_sys_call+0x18e6/0x20c0
<4> [519.566682] do_syscall_64+0x88/0x140
<4> [519.566684] entry_SYSCALL_64_after_hwframe+0x76/0x7e
<3> [519.566687] Slab 0xffffea0008270800 objects=10 used=0 fp=0xffff888209c21c00 flags=0x17ffffc0000840(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
<3> [519.566723] Object 0xffff888209c24c00 @offset=19456 fp=0xffff888209c27000
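A note on reading the report above: with SLUB poisoning enabled, freed objects are filled with 0x6b (POISON_FREE), so "First byte 0x7d instead of 0x6b" means something wrote to the object after xe_file_close freed it. The two @offset values place that write inside the object. Below is a minimal sketch of the arithmetic; the helper names are hypothetical and the numbers are copied from the log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Byte SLUB writes over freed objects when poisoning is enabled. */
#define POISON_FREE 0x6b

/* Hypothetical helper: offset of a corrupted byte from the start of its object. */
static uint64_t offset_in_object(uint64_t corrupt_addr, uint64_t object_addr)
{
	return corrupt_addr - object_addr;
}

/* Hypothetical helper: a freed byte is intact only if it still holds the poison value. */
static int poison_intact(uint8_t byte)
{
	return byte == POISON_FREE;
}

/* Prints where the overwrite from the log landed inside the kmalloc-1k object. */
static void report_overwrite(void)
{
	uint64_t object = 0xffff888209c24c00ULL;  /* "Object ... @offset=19456" */
	uint64_t corrupt = 0xffff888209c24de0ULL; /* "... @offset=19936" */
	uint64_t off = offset_in_object(corrupt, object);

	/* The address difference must agree with the slab offsets in the log. */
	assert(off == 19936 - 19456);

	printf("overwrite at object+0x%llx, poison intact: %d\n",
	       (unsigned long long)off, poison_intact(0x7d));
}
```

Calling report_overwrite() shows the stray write landing 0x1e0 (480) bytes into the freed allocation. Which member sits at that offset is not stated in the log.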
- Reporter
A CI Bug Log filter associated to this bug has been updated by Vinay.
Description: ADL_P ATS_M DG2 LNL PVC: few tests - abort/dmesg-warn - BUG kmalloc-* Poison overwritten
Equivalent query: runconfig_tag IS IN ["xe"] AND machine_tag IS IN ["DG2", "ATSM-HW", "PVC", "256EU", "LNL", "ADL-P"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@xe_exec_threads@threads-mixed-shared-vm-userptr-invalidate-race", "igt@kms_color@invalid-gamma-lut-sizes", "igt@xe_exec_threads@threads-hang-fd-userptr-rebind", "igt@kms_flip@flip-vs-blocking-wf-vblank", "igt@kms_vblank@crtc-id@pipe-a-edp-1", "igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-async-flip", "igt@kms_atomic_transition@plane-all-transition-fencing@pipe-a-edp-1", "igt@kms_cursor_crc@cursor-onscreen-64x21@pipe-d-hdmi-a-1", "igt@xe_exec_threads@threads-mixed-userptr-invalidate", "igt@kms_cursor_crc@cursor-offscreen-64x64@pipe-a-hdmi-a-6", "igt@kms_flip@flip-vs-wf_vblank-interruptible@b-hdmi-a6", "igt@xe_exec_threads@threads-bal-shared-vm-rebind", "igt@kms_frontbuffer_tracking@fbc-tiling-linear", "igt@kms_universal_plane@universal-plane-functional@pipe-a-hdmi-a-6", "igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move", "igt@kms_color@invalid-gamma-lut-sizes@pipe-a", "igt@kms_cursor_crc@cursor-dpms@pipe-a-hdmi-a-6", "igt@xe_ccs@suspend-resume", "igt@kms_psr@fbc-psr2-sprite-plane-move", "igt@kms_cursor_crc@cursor-dpms", "igt@xe_exec_threads@threads-bal-mixed-fd-basic", "igt@kms_async_flips@async-flip-with-page-flip-events", "igt@xe_exec_basic@many-execqueues-many-vm-userptr-invalidate", "igt@xe_exec_fault_mode@many-execqueues-basic-prefetch", "igt@kms_vblank@crtc-id", "igt@kms_mmap_write_crc@main@pipe-a-hdmi-a-6", "igt@kms_atomic_transition@plane-all-transition-fencing", "igt@kms_lease@setcrtc-implicit-plane", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt", "igt@xe_ccs@suspend-resume@linear-compressed-compfmt0-system-vram01", "igt@kms_cursor_crc@cursor-offscreen-64x64", "igt@kms_async_flips@crc", "igt@kms_hdr@bpc-switch-suspend@pipe-a-hdmi-a-6", "igt@kms_async_flips@async-flip-with-page-flip-events@pipe-a-hdmi-a-6-4-mc-ccs", 
"igt@kms_color@invalid-gamma-lut-sizes@pipe-c", "igt@kms_async_flips@crc@pipe-a-hdmi-a-6", "igt@xe_exec_basic@many-execqueues-bindexecqueue-userptr-rebind", "igt@kms_mmap_write_crc@main@pipe-a-edp-1", "igt@kms_cursor_crc@cursor-random-128x128@pipe-a-hdmi-a-1", "igt@kms_psr@fbc-psr2-sprite-plane-move@edp-1", "igt@kms_cursor_crc@cursor-random-128x128", "igt@xe_exec_threads@threads-mixed-fd-basic", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt", "igt@kms_flip@busy-flip@a-edp1", "igt@kms_universal_plane@universal-plane-functional", "igt@xe_pm@s4-vm-bind-unbind-all", "igt@kms_hdr@bpc-switch-suspend", "igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-shrfb-msflip-blt", "igt@kms_vblank@query-idle-hang@pipe-a-edp-1", "igt@kms_color@ctm-negative@pipe-a", "igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-plflip-blt", "igt@kms_color@ctm-negative@pipe-b", "igt@kms_flip@flip-vs-blocking-wf-vblank@a-hdmi-a6", "igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-wc", "igt@kms_flip@2x-dpms-vs-vblank-race", "igt@xe_exec_basic@multigpu-many-execqueues-many-vm-rebind", "igt@kms_universal_plane@universal-plane-functional@pipe-a-hdmi-a-1", "igt@xe_exec_balancer@once-cm-virtual-userptr", "igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-blt", "igt@kms_universal_plane@universal-plane-functional@pipe-b-dp-4", "igt@kms_force_connector_basic@force-connector-state", "igt@kms_universal_plane@universal-plane-pageflip-windowed@pipe-a-hdmi-a-6", "igt@kms_vblank@query-idle-hang", "igt@xe_exec_compute_mode@twice-bindexecqueue-userptr", "igt@kms_lease@setcrtc-implicit-plane@pipe-a-hdmi-a-6", "igt@xe_exec_threads@threads-userptr", "igt@kms_mmap_write_crc@main", "igt@kms_flip@busy-flip", "igt@xe_exec_threads@threads-shared-vm-userptr-rebind", "igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-indfb-draw-blt", "igt@xe_pm_residency@toggle-gt-c6", 
"igt@kms_async_flips@async-flip-with-page-flip-events@pipe-a-edp-1-4", "igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-wc", "igt@xe_exec_store@basic-store", "igt@xe_exec_balancer@many-cm-parallel-userptr", "igt@kms_cursor_crc@cursor-onscreen-64x21", "igt@xe_compute@ccs-mode-basic", "igt@kms_flip@flip-vs-wf_vblank-interruptible", "igt@xe_exec_compute_mode@twice-rebind", "igt@kms_async_flips@crc@pipe-a-edp-1", "igt@kms_universal_plane@universal-plane-pageflip-windowed", "igt@xe_exec_fault_mode@twice-userptr-invalidate-prefetch", "igt@kms_color@gamma", "igt@kms_color@gamma@pipe-b", "igt@kms_color@ctm-negative"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["abort", "dmesg-warn"])) AND dmesg ~= 'BUG kmalloc-.* Poison overwritten'
New failures caught by the filter:
- Reporter
A CI Bug Log filter associated to this bug has been updated by Vinay.
Description: ADL_P ATS_M DG2 LNL PVC: few tests - abort/dmesg-warn - BUG kmalloc-* Poison overwritten
Equivalent query: runconfig_tag IS IN ["xe"] AND machine_tag IS IN ["DG2", "ATSM-HW", "PVC", "256EU", "LNL", "ADL-P"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@xe_exec_threads@threads-mixed-shared-vm-userptr-invalidate-race", "igt@kms_color@invalid-gamma-lut-sizes", "igt@xe_exec_threads@threads-hang-fd-userptr-rebind", "igt@kms_flip@flip-vs-blocking-wf-vblank", "igt@kms_vblank@crtc-id@pipe-a-edp-1", "igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-async-flip", "igt@kms_atomic_transition@plane-all-transition-fencing@pipe-a-edp-1", "igt@kms_cursor_crc@cursor-onscreen-64x21@pipe-d-hdmi-a-1", "igt@xe_exec_threads@threads-mixed-userptr-invalidate", "igt@kms_cursor_crc@cursor-offscreen-64x64@pipe-a-hdmi-a-6", "igt@kms_flip@flip-vs-wf_vblank-interruptible@b-hdmi-a6", "igt@xe_exec_threads@threads-bal-shared-vm-rebind", "igt@kms_frontbuffer_tracking@fbc-tiling-linear", "igt@kms_universal_plane@universal-plane-functional@pipe-a-hdmi-a-6", "igt@xe_exec_threads@threads-mixed-shared-vm-basic", "igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move", "igt@kms_color@invalid-gamma-lut-sizes@pipe-a", "igt@kms_cursor_crc@cursor-dpms@pipe-a-hdmi-a-6", "igt@xe_ccs@suspend-resume", "igt@kms_psr@fbc-psr2-sprite-plane-move", "igt@kms_cursor_crc@cursor-dpms", "igt@xe_exec_threads@threads-bal-mixed-fd-basic", "igt@kms_async_flips@async-flip-with-page-flip-events", "igt@xe_exec_basic@many-execqueues-many-vm-userptr-invalidate", "igt@xe_exec_fault_mode@many-execqueues-basic-prefetch", "igt@kms_vblank@crtc-id", "igt@kms_mmap_write_crc@main@pipe-a-hdmi-a-6", "igt@kms_atomic_transition@plane-all-transition-fencing", "igt@kms_lease@setcrtc-implicit-plane", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt", "igt@xe_ccs@suspend-resume@linear-compressed-compfmt0-system-vram01", "igt@kms_cursor_crc@cursor-offscreen-64x64", "igt@kms_async_flips@crc", "igt@kms_hdr@bpc-switch-suspend@pipe-a-hdmi-a-6", 
"igt@kms_async_flips@async-flip-with-page-flip-events@pipe-a-hdmi-a-6-4-mc-ccs", "igt@kms_color@invalid-gamma-lut-sizes@pipe-c", "igt@kms_async_flips@crc@pipe-a-hdmi-a-6", "igt@xe_exec_basic@many-execqueues-bindexecqueue-userptr-rebind", "igt@kms_mmap_write_crc@main@pipe-a-edp-1", "igt@kms_cursor_crc@cursor-random-128x128@pipe-a-hdmi-a-1", "igt@kms_psr@fbc-psr2-sprite-plane-move@edp-1", "igt@kms_cursor_crc@cursor-random-128x128", "igt@xe_exec_threads@threads-mixed-fd-basic", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt", "igt@kms_flip@busy-flip@a-edp1", "igt@kms_universal_plane@universal-plane-functional", "igt@xe_pm@s4-vm-bind-unbind-all", "igt@kms_hdr@bpc-switch-suspend", "igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-shrfb-msflip-blt", "igt@kms_vblank@query-idle-hang@pipe-a-edp-1", "igt@kms_color@ctm-negative@pipe-a", "igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-plflip-blt", "igt@kms_color@ctm-negative@pipe-b", "igt@kms_flip@flip-vs-blocking-wf-vblank@a-hdmi-a6", "igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-wc", "igt@kms_flip@2x-dpms-vs-vblank-race", "igt@xe_exec_basic@multigpu-many-execqueues-many-vm-rebind", "igt@kms_universal_plane@universal-plane-functional@pipe-a-hdmi-a-1", "igt@xe_exec_balancer@once-cm-virtual-userptr", "igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-blt", "igt@kms_universal_plane@universal-plane-functional@pipe-b-dp-4", "igt@kms_force_connector_basic@force-connector-state", "igt@kms_universal_plane@universal-plane-pageflip-windowed@pipe-a-hdmi-a-6", "igt@kms_vblank@query-idle-hang", "igt@xe_exec_compute_mode@twice-bindexecqueue-userptr", "igt@kms_lease@setcrtc-implicit-plane@pipe-a-hdmi-a-6", "igt@xe_exec_threads@threads-userptr", "igt@kms_mmap_write_crc@main", "igt@kms_flip@busy-flip", "igt@xe_exec_threads@threads-shared-vm-userptr-rebind", "igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-indfb-draw-blt", 
"igt@xe_pm_residency@toggle-gt-c6", "igt@kms_async_flips@async-flip-with-page-flip-events@pipe-a-edp-1-4", "igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-wc", "igt@xe_exec_store@basic-store", "igt@xe_exec_balancer@many-cm-parallel-userptr", "igt@kms_cursor_crc@cursor-onscreen-64x21", "igt@xe_compute@ccs-mode-basic", "igt@kms_flip@flip-vs-wf_vblank-interruptible", "igt@xe_exec_compute_mode@twice-rebind", "igt@kms_async_flips@crc@pipe-a-edp-1", "igt@kms_universal_plane@universal-plane-pageflip-windowed", "igt@xe_exec_fault_mode@twice-userptr-invalidate-prefetch", "igt@kms_color@gamma", "igt@kms_color@gamma@pipe-b", "igt@kms_color@ctm-negative"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["abort", "dmesg-warn"])) AND dmesg ~= 'BUG kmalloc-.* Poison overwritten'
New failures caught by the filter:
- Reporter
A CI Bug Log filter associated to this bug has been updated by Vinay.
Description: ADL_P ATS_M DG2 LNL PVC: few tests - abort/dmesg-warn - BUG kmalloc-* Poison overwritten
Equivalent query: runconfig_tag IS IN ["xe"] AND machine_tag IS IN ["DG2", "ATSM-HW", "PVC", "256EU", "LNL", "ADL-P"] AND ((testsuite_name = "IGT" AND test_name IS IN ["igt@xe_exec_threads@threads-mixed-shared-vm-userptr-invalidate-race", "igt@kms_color@invalid-gamma-lut-sizes", "igt@xe_exec_threads@threads-hang-fd-userptr-rebind", "igt@kms_flip@flip-vs-blocking-wf-vblank", "igt@kms_vblank@crtc-id@pipe-a-edp-1", "igt@kms_big_fb@y-tiled-max-hw-stride-32bpp-rotate-0-async-flip", "igt@kms_atomic_transition@plane-all-transition-fencing@pipe-a-edp-1", "igt@kms_cursor_crc@cursor-onscreen-64x21@pipe-d-hdmi-a-1", "igt@xe_exec_threads@threads-mixed-userptr-invalidate", "igt@kms_cursor_crc@cursor-offscreen-64x64@pipe-a-hdmi-a-6", "igt@kms_flip@flip-vs-wf_vblank-interruptible@b-hdmi-a6", "igt@xe_exec_threads@threads-bal-shared-vm-rebind", "igt@kms_frontbuffer_tracking@fbc-tiling-linear", "igt@kms_universal_plane@universal-plane-functional@pipe-a-hdmi-a-6", "igt@xe_exec_threads@threads-mixed-shared-vm-basic", "igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-move", "igt@kms_color@invalid-gamma-lut-sizes@pipe-a", "igt@xe_exec_compute_mode@twice-basic", "igt@kms_cursor_crc@cursor-dpms@pipe-a-hdmi-a-6", "igt@xe_ccs@suspend-resume", "igt@kms_psr@fbc-psr2-sprite-plane-move", "igt@kms_cursor_crc@cursor-dpms", "igt@xe_exec_threads@threads-bal-mixed-fd-basic", "igt@kms_cursor_crc@cursor-dpms", "igt@kms_async_flips@async-flip-with-page-flip-events", "igt@xe_exec_basic@many-execqueues-many-vm-userptr-invalidate", "igt@xe_exec_fault_mode@many-execqueues-basic-prefetch", "igt@kms_vblank@crtc-id", "igt@kms_mmap_write_crc@main@pipe-a-hdmi-a-6", "igt@kms_atomic_transition@plane-all-transition-fencing", "igt@kms_lease@setcrtc-implicit-plane", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-pri-indfb-draw-blt", "igt@xe_ccs@suspend-resume@linear-compressed-compfmt0-system-vram01", "igt@kms_cursor_crc@cursor-offscreen-64x64", "igt@kms_async_flips@crc", 
"igt@kms_hdr@bpc-switch-suspend@pipe-a-hdmi-a-6", "igt@kms_async_flips@async-flip-with-page-flip-events@pipe-a-hdmi-a-6-4-mc-ccs", "igt@kms_color@invalid-gamma-lut-sizes@pipe-c", "igt@kms_async_flips@crc@pipe-a-hdmi-a-6", "igt@xe_exec_basic@many-execqueues-bindexecqueue-userptr-rebind", "igt@kms_mmap_write_crc@main@pipe-a-edp-1", "igt@kms_cursor_crc@cursor-random-128x128@pipe-a-hdmi-a-1", "igt@kms_psr@fbc-psr2-sprite-plane-move@edp-1", "igt@kms_cursor_crc@cursor-random-128x128", "igt@xe_exec_threads@threads-mixed-fd-basic", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-blt", "igt@kms_flip@busy-flip@a-edp1", "igt@kms_universal_plane@universal-plane-functional", "igt@xe_pm@s4-vm-bind-unbind-all", "igt@kms_hdr@bpc-switch-suspend", "igt@kms_frontbuffer_tracking@fbc-1p-pri-indfb-multidraw", "igt@kms_frontbuffer_tracking@fbc-1p-primscrn-shrfb-msflip-blt", "igt@kms_vblank@query-idle-hang@pipe-a-edp-1", "igt@kms_color@ctm-negative@pipe-a", "igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-shrfb-plflip-blt", "igt@kms_color@ctm-negative@pipe-b", "igt@kms_flip@flip-vs-blocking-wf-vblank@a-hdmi-a6", "igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-mmap-wc", "igt@kms_flip@2x-dpms-vs-vblank-race", "igt@xe_exec_basic@multigpu-many-execqueues-many-vm-rebind", "igt@kms_universal_plane@universal-plane-functional@pipe-a-hdmi-a-1", "igt@xe_exec_balancer@once-cm-virtual-userptr", "igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-draw-blt", "igt@kms_universal_plane@universal-plane-functional@pipe-b-dp-4", "igt@kms_force_connector_basic@force-connector-state", "igt@kms_universal_plane@universal-plane-pageflip-windowed@pipe-a-hdmi-a-6", "igt@kms_vblank@query-idle-hang", "igt@xe_exec_compute_mode@twice-bindexecqueue-userptr", "igt@kms_lease@setcrtc-implicit-plane@pipe-a-hdmi-a-6", "igt@xe_exec_threads@threads-userptr", "igt@kms_mmap_write_crc@main", "igt@kms_flip@busy-flip", "igt@xe_exec_threads@threads-shared-vm-userptr-rebind", 
"igt@kms_frontbuffer_tracking@fbc-1p-offscren-pri-indfb-draw-blt", "igt@xe_pm_residency@toggle-gt-c6", "igt@kms_async_flips@async-flip-with-page-flip-events@pipe-a-edp-1-4", "igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-wc", "igt@xe_exec_store@basic-store", "igt@xe_exec_balancer@many-cm-parallel-userptr", "igt@kms_cursor_crc@cursor-onscreen-64x21", "igt@xe_compute@ccs-mode-basic", "igt@kms_flip@flip-vs-wf_vblank-interruptible", "igt@xe_exec_compute_mode@twice-rebind", "igt@kms_async_flips@crc@pipe-a-edp-1", "igt@kms_universal_plane@universal-plane-pageflip-windowed", "igt@xe_exec_fault_mode@twice-userptr-invalidate-prefetch", "igt@kms_color@gamma", "igt@kms_color@gamma@pipe-b", "igt@kms_color@ctm-negative"])) AND ((testsuite_name = "IGT" AND status_name IS IN ["abort", "dmesg-warn"])) AND dmesg ~= 'BUG kmalloc-.* Poison overwritten'
New failures caught by the filter:
- Developer
Likely culprit is: https://patchwork.freedesktop.org/patch/594577/?series=132477&rev=5
AFAICT you can also see the same failure signature in the pre-merge results for that series.
The patch extends the xef structure with a new field, so likely we are accessing that new field after the xef structure is freed. Most likely the guc_exec_queue_free_job can race with xe_file_close.
- Matthew Auld assigned to @demarchi
- Rodrigo Vivi assigned to @unerlige and unassigned @demarchi
- Maintainer
@unerlige could you please help with this UAF regression? I could confirm that @mwa is absolutely right... a simple
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -762,8 +762,13 @@ guc_exec_queue_run_job(struct drm_sched_job *drm_job)
 static void guc_exec_queue_free_job(struct drm_sched_job *drm_job)
 {
 	struct xe_sched_job *job = to_xe_sched_job(drm_job);
+	struct xe_device *xe = gt_to_xe(job->q->gt);
 
-	xe_exec_queue_update_runtime(job->q);
+	/* Do not update exec_queue runtime info if the client is already gone */
+	spin_lock(&xe->clients.lock);
+	if (xe->clients.count)
+		xe_exec_queue_update_runtime(job->q);
+	spin_unlock(&xe->clients.lock);
could do the trick and avoid the UAF reported above. However, I got confused by the whole flow. Should this check be inside xe_exec_queue_update_runtime() itself to avoid other races? But then, if it is inside, what is the relation with if (!q->vm || !q->vm->xef)? Was that an attempt to solve/mask exactly this behavior? Is it still needed if we are checking clients.count?
Also, is there a better name for this thing? runtime what?! "update runtime", "save runtime", "runtime in ticks" runtime what? ticks of what? Isn't there a better name that really has some meaning?
- Developer
AFAICT clients.count is just the count of all clients. Here we would need some kind of synchronisation for a single client going away. Similar to your idea, it looks like we could in theory set the vm->xef pointer to NULL under the lock somewhere in xe_file_close(), and then in xe_exec_queue_update_runtime() grab the lock before checking whether the vm->xef pointer is NULL and keep it held until you are done touching the pointer.
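The idea in the comment above can be sketched in userspace as follows. This is only an illustration under stated assumptions: a pthread mutex stands in for the kernel lock, and the struct and function names (client, vm, client_close, update_runtime) are hypothetical, not the xe driver's. The point it shows is that the NULL check and the dereference share one critical section with the close path's pointer reset.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins; all names here are hypothetical. */
struct client {
	uint64_t run_ticks;     /* per-client runtime accumulator */
};

struct vm {
	pthread_mutex_t lock;   /* protects the xef back-pointer */
	struct client *xef;     /* NULL once the client has gone away */
};

static void vm_init(struct vm *vm, struct client *xef)
{
	pthread_mutex_init(&vm->lock, NULL);
	vm->xef = xef;
}

/* Close path: detach the client under the lock, then free it. */
static void client_close(struct vm *vm)
{
	struct client *xef;

	pthread_mutex_lock(&vm->lock);
	xef = vm->xef;
	vm->xef = NULL;
	pthread_mutex_unlock(&vm->lock);

	free(xef);              /* no longer reachable through vm->xef */
}

/*
 * Update path: check and use the pointer inside one critical section.
 * Returns 1 if the client was updated, 0 if it was already gone.
 */
static int update_runtime(struct vm *vm, uint64_t ticks)
{
	int updated = 0;

	pthread_mutex_lock(&vm->lock);
	if (vm->xef) {
		vm->xef->run_ticks += ticks;
		updated = 1;
	}
	pthread_mutex_unlock(&vm->lock);

	return updated;
}
```

Because client_close() clears the pointer before freeing, an update_runtime() call that starts after the close sees NULL and skips the update, and one that starts before it finishes touching the client before the free can proceed.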
- Developer
Also chatted to @mbrost about this who said that @dceraolo was also facing a similar type of issue and a possible fix there was to make the queue kill callback accept a flag to forcefully ensure we finish calling the free_job for all jobs on that queue before it returns, which we could then make use of in the xe_file_close() case.
From what I understand, the suggestion is to wait in xe_exec_queue_kill() for guc_exec_queue_free_job() to complete. @mwa What condition should I wait on to ensure that? Any thoughts?
From the above analysis, it looks like the race occurs in the xe_file_close flow because the queue is killed as scheduled work and the xef is then closed in parallel. If so, I would just do this:
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -763,7 +763,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job *drm_job)
 {
 	struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
-	xe_exec_queue_update_runtime(job->q);
+	if (!exec_queue_killed(job->q))
+		xe_exec_queue_update_runtime(job->q);
 	trace_xe_sched_job_free(job);
 	xe_sched_job_put(job);
Edited by unerlige
- Developer
You could potentially get an interrupt or be pre-empted after checking queue_killed, and during that time you can race with xe_file_close killing the queue and freeing the xef pointer. Or is that somehow not possible?
I'm not really sure how to do the "wait in xe_exec_queue_kill() for guc_exec_queue_free_job() to complete".
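The window described in this reply can be replayed deterministically in userspace. The sketch below is a hypothetical model, not driver code: a freed flag stands in for kfree() so the bad access can be observed safely, and the interleaving (check, then close, then update) is forced by hand rather than by real preemption.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the objects involved. */
struct client {
	bool freed;             /* stands in for the memory having been freed */
};

struct queue {
	bool killed;
	struct client *xef;
};

/* Step taken by the free_job path: the unlocked check. */
static bool check_not_killed(const struct queue *q)
{
	return !q->killed;
}

/* Steps taken by the close path: kill the queue, then release the client. */
static void close_file(struct queue *q)
{
	q->killed = true;
	q->xef->freed = true;
}

/*
 * Replays: check -> (preempted) -> close -> update.
 * Returns true if the update would have touched a freed client.
 */
static bool replay_check_then_act(void)
{
	struct client xef = { .freed = false };
	struct queue q = { .killed = false, .xef = &xef };
	bool touches_freed = false;

	if (check_not_killed(&q)) {             /* passes: not killed yet */
		close_file(&q);                 /* close runs inside the window */
		touches_freed = q.xef->freed;   /* update now hits a freed client */
	}

	return touches_freed;
}
```

replay_check_then_act() returns true: the check passed, yet the client was gone by the time the update ran. An unlocked exec_queue_killed() test narrows the window but does not close it unless the check and the update are serialised against the close path.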
- Matthew Auld marked #1939 (closed) as a duplicate of this issue
- Matthew Auld marked this issue as related to #1939 (closed)
Posted a fix here - https://patchwork.freedesktop.org/series/134030/. Waiting for CI
Final patch is posted here - https://patchwork.freedesktop.org/patch/595493/?series=134037, but I have skipped CI on this one since it has only a rename.
Waiting for the FULL CI run here - https://patchwork.freedesktop.org/series/134033/.
Edited by unerlige
- Reporter
The CI Bug Log issue associated to this bug has been updated by adelaryb.
New filters associated
- ATS_M PVC ADL_P DG2 LNL: few tests - abort/incomplete - BUG kmalloc-1k*Poison overwritten (No new failures associated)
- Reporter
The CI Bug Log issue associated to this bug has been archived.
New failures matching the above filters will not be associated to this bug anymore.