Crash in Firefox in amdgpu_add_bo_fence_dependencies
We are seeing use-after-free (UAF) crashes in amdgpu_add_bo_fence_dependencies in Firefox crash reports, at a rate of approximately 5 crashes per day.
The backtrace is:
0 libgallium_dri.so amdgpu_add_bo_fence_dependencies /usr/src/debug/mesa/mesa-23.0.1/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c:1267
0 libgallium_dri.so amdgpu_add_fence_dependencies_bo_list /usr/src/debug/mesa/mesa-23.0.1/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c:1350
1 libgallium_dri.so amdgpu_add_fence_dependencies_bo_lists /usr/src/debug/mesa/mesa-23.0.1/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c:1361
1 libgallium_dri.so amdgpu_cs_submit_ib /usr/src/debug/mesa/mesa-23.0.1/src/gallium/winsys/amdgpu/drm/amdgpu_cs.c:1424
2 libgallium_dri.so util_queue_thread_func /usr/src/debug/mesa/mesa-23.0.1/src/util/u_queue.c:309
3 libgallium_dri.so impl_thrd_routine /usr/src/debug/mesa/mesa-23.0.1/src/c11/impl/threads_posix.c:67
4 firefox set_alt_signal_stack_and_start toolkit/crashreporter/pthread_create_interposer/pthread_create_interposer.cpp:80
5 libc.so.6 start_thread /usr/src/debug/glibc/glibc/nptl/pthread_create.c:444
6 libc.so.6 __clone3 /usr/src/debug/glibc/glibc/sysdeps/unix/sysv/linux/x86_64/clone3.S:81
Some repro information from a user:
The crash only occurs when the Ambient Light for YouTube extension is installed and enabled. Most of the time only Firefox crashes, but sometimes amdgpu crashes as well, causing a GPU reset:
kernel: amdgpu 0000:0b:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32788, for process firefox pid 14832 thread firefox:cs0 pid 14896)
kernel: amdgpu 0000:0b:00.0: amdgpu: in page starting at address 0x000080010a615000 from client 0x1b (UTCL2)
kernel: amdgpu 0000:0b:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401031
kernel: amdgpu 0000:0b:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
kernel: amdgpu 0000:0b:00.0: amdgpu: MORE_FAULTS: 0x1
kernel: amdgpu 0000:0b:00.0: amdgpu: WALKER_ERROR: 0x0
kernel: amdgpu 0000:0b:00.0: amdgpu: PERMISSION_FAULTS: 0x3
kernel: amdgpu 0000:0b:00.0: amdgpu: MAPPING_ERROR: 0x0
kernel: amdgpu 0000:0b:00.0: amdgpu: RW: 0x0
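
For anyone cross-checking the log, the decoded fields the kernel prints appear to be plain bit slices of the raw GCVM_L2_PROTECTION_FAULT_STATUS value. The following is a minimal sketch, not driver code: the bit offsets and widths in `FIELDS` are an assumption about the register layout on this hardware generation (they are not stated in this report), and `decode_fault_status` is a hypothetical helper name. With that assumed layout, the raw value 0x00401031 reproduces the values logged above, including the client ID 0x8 (TCP).

```python
# Assumed layout of GCVM_L2_PROTECTION_FAULT_STATUS: name -> (bit offset, width).
# These offsets are an assumption, checked only against the log lines in this report.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # "Faulty UTCL2 client ID" in the kernel log
    "RW": (18, 1),
}


def decode_fault_status(status):
    """Split a raw fault status value into its (assumed) bit fields."""
    return {
        name: (status >> shift) & ((1 << width) - 1)
        for name, (shift, width) in FIELDS.items()
    }


# Raw value taken from the kernel log in this report.
decoded = decode_fault_status(0x00401031)
for name, value in decoded.items():
    print(f"{name}: {value:#x}")
```

If the assumed layout is right, RW: 0x0 would indicate the faulting access was a read, and a non-zero PERMISSION_FAULTS with MAPPING_ERROR: 0x0 points at a permission problem on the page rather than a broken page-table walk. That would fit the GPU touching a buffer that had already been released, but this is an interpretation, not something the log confirms.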