drm/amdgpu: Introduce conditional user queue suspension for SDMA resets
- Modify the `amdgpu_sdma_reset_engine` function to accept a `suspend_user_queues` parameter.
- This parameter allows the function to conditionally suspend and resume user queues during SDMA resets.
- Ensure that user queues are suspended only when necessary, to avoid unnecessary overhead and potential deadlocks.
- Restart the scheduler's work queue for the GFX and page rings after the reset to allow new tasks to be submitted.

This change improves synchronization between the KGD and the KFD during SDMA resets, ensuring proper handling of user queues and avoiding race conditions.

v2: replace the ring_lock with the existing scheduler locks for the queues (ring->sched) on the SDMA engine. (Alex)

v3: call drm_sched_wqueue_stop() rather than using job_list_lock.
    If a GPU ring reset was already initiated for one ring at amdgpu_job_timedout, skip resetting that ring and call drm_sched_wqueue_stop() for the other rings. (Alex)
    Replace the common lock (sdma_reset_lock) with the DQM lock to resolve reset races between the two driver sections during KFD eviction. (Jon)
    Rename the caller to Reset_src and change AMDGPU_RESET_SRC_SDMA_KGD/KFD to AMDGPU_RESET_SRC_SDMA_HWS/RING. (Jon)

v4: restart the wqueue if the reset was successful, or fall back to a full adapter reset. (Alex)
    Move the definition of the reset source to the enumeration AMDGPU_RESET_SRCS, and check the reset src in amdgpu_sdma_reset_instance. (Jon)

v5: call amdgpu_amdkfd_suspend/resume at the start/end of the reset function respectively, under !SRC_HWS conditions only. (Jon)

v6: replace the parameter src with a bool suspend_user_queues, and remove the parameter src in the pre/post func. (Jon)

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Suggested-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Acked-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
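
The reset flow described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: every type and helper below (`mock_ring`, `mock_sdma_engine`, `mock_sdma_reset_engine`, and the rest) is hypothetical and merely stands in for the kernel objects the message names, such as the ring schedulers and the KFD suspend/resume calls. It assumes that user queues are resumed whether or not the reset succeeds, and it represents the "full adapter reset" fallback with a simple flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of this is the real amdgpu/DRM API. */
struct mock_ring {
    bool wqueue_running;            /* models the scheduler work queue state */
};

struct mock_sdma_engine {
    struct mock_ring gfx_ring;
    struct mock_ring page_ring;
    int suspend_calls;              /* how often user queues were suspended */
    int resume_calls;               /* how often user queues were resumed */
    bool hw_reset_should_fail;      /* knob to simulate a failed engine reset */
    bool needs_full_reset;          /* assumed stand-in for the adapter-reset fallback */
};

/* Stop a ring's work queue only if it is running; report whether we stopped it. */
static bool mock_wqueue_stop(struct mock_ring *ring)
{
    if (!ring->wqueue_running)
        return false;               /* already stopped elsewhere, e.g. by a timeout handler */
    ring->wqueue_running = false;
    return true;
}

static void mock_wqueue_start(struct mock_ring *ring)
{
    ring->wqueue_running = true;
}

static int mock_hw_reset(const struct mock_sdma_engine *eng)
{
    return eng->hw_reset_should_fail ? -1 : 0;
}

/* Sketch of a reset with conditional user queue suspension. */
int mock_sdma_reset_engine(struct mock_sdma_engine *eng, bool suspend_user_queues)
{
    bool gfx_stopped, page_stopped;
    int ret;

    assert(eng != NULL);

    /* Suspend user queues only when the caller asks for it. */
    if (suspend_user_queues)
        eng->suspend_calls++;

    /* Block new submissions while the reset is in progress. */
    gfx_stopped = mock_wqueue_stop(&eng->gfx_ring);
    page_stopped = mock_wqueue_stop(&eng->page_ring);

    ret = mock_hw_reset(eng);

    if (ret == 0) {
        /* Restart only the work queues this function stopped. */
        if (gfx_stopped)
            mock_wqueue_start(&eng->gfx_ring);
        if (page_stopped)
            mock_wqueue_start(&eng->page_ring);
    } else {
        /* Leave the queues stopped and flag the fallback path. */
        eng->needs_full_reset = true;
    }

    /* Resume user queues if they were suspended above (assumed on both paths). */
    if (suspend_user_queues)
        eng->resume_calls++;

    return ret;
}
```

In this model, a caller that has already quiesced the user queues itself would pass `false`, while a caller that has not would pass `true`. A ring whose work queue was already stopped before the call is left stopped, mirroring the v3 note about a ring reset already initiated from the job timeout path.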
Changed files:
- drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c (50 additions, 6 deletions)
- drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h (1 addition, 1 deletion)
- drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c (3 additions, 1 deletion)