  • drm/amdgpu/kfd: Add shared SDMA reset functionality with callback support · f3304495
    Jie1zhang authored
    
    This patch introduces shared SDMA reset functionality between AMDGPU and KFD.
    The implementation includes the following key changes:
    
    1. Added `amdgpu_sdma_reset_queue` (renamed to `amdgpu_sdma_reset_engine` in v4):
       - Resets a specific SDMA queue by instance ID.
       - Invokes registered pre-reset and post-reset callbacks to allow KFD and AMDGPU
         to save/restore their state during the reset process.
    
    2. Added `amdgpu_set_on_reset_callbacks` (named `amdgpu_sdma_register_on_reset_callbacks` as of v3):
       - Allows KFD and AMDGPU to register callback functions for pre-reset and
         post-reset operations.
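       - Callbacks are stored in a linked list (moved from a global into
         struct amdgpu_sdma in v3). During an SDMA reset, the pre-reset callbacks
         run before the reset and the post-reset callbacks run after it.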
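
The registration and reset flow described in items 1 and 2 can be sketched as a small userspace model. This is an illustration under stated assumptions, not the driver code: `struct reset_cb`, `struct sdma_model`, `register_reset_cb`, `reset_engine`, and `run_demo` are hypothetical names, the list is a hand-rolled singly linked list rather than the kernel's list API, and running callbacks in registration order is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from amdgpu. */
enum event { EV_KFD_PRE, EV_AMDGPU_PRE, EV_HW_RESET, EV_KFD_POST, EV_AMDGPU_POST };

#define MAX_EVENTS 8

struct reset_cb {
    int (*pre_reset)(void *ctx, unsigned int instance_id);
    int (*post_reset)(void *ctx, unsigned int instance_id);
    void *ctx;
    struct reset_cb *next;
};

/* The callback list lives in the per-device SDMA state, not in a global. */
struct sdma_model {
    struct reset_cb *head, *tail;
    enum event log[MAX_EVENTS];
    size_t log_len;
};

static void log_event(struct sdma_model *m, enum event ev)
{
    if (m->log_len < MAX_EVENTS)
        m->log[m->log_len++] = ev;
}

/* Append a callback pair to the end of the list. */
static void register_reset_cb(struct sdma_model *m, struct reset_cb *cb)
{
    assert(m != NULL && cb != NULL);
    cb->next = NULL;
    if (m->tail)
        m->tail->next = cb;
    else
        m->head = cb;
    m->tail = cb;
}

/* Run every pre-reset hook, do the (simulated) reset, then every post-reset hook. */
static int reset_engine(struct sdma_model *m, unsigned int instance_id)
{
    struct reset_cb *cb;
    int ret;

    for (cb = m->head; cb; cb = cb->next) {
        if (cb->pre_reset && (ret = cb->pre_reset(cb->ctx, instance_id)))
            return ret; /* a failing pre-reset hook aborts the reset */
    }
    log_event(m, EV_HW_RESET); /* stand-in for the hardware reset itself */
    for (cb = m->head; cb; cb = cb->next) {
        if (cb->post_reset && (ret = cb->post_reset(cb->ctx, instance_id)))
            return ret;
    }
    return 0;
}

/* Stand-ins for the state save/restore done by each client. */
static int kfd_pre(void *ctx, unsigned int id)     { (void)id; log_event(ctx, EV_KFD_PRE); return 0; }
static int kfd_post(void *ctx, unsigned int id)    { (void)id; log_event(ctx, EV_KFD_POST); return 0; }
static int amdgpu_pre(void *ctx, unsigned int id)  { (void)id; log_event(ctx, EV_AMDGPU_PRE); return 0; }
static int amdgpu_post(void *ctx, unsigned int id) { (void)id; log_event(ctx, EV_AMDGPU_POST); return 0; }

/* Register both clients on a zero-initialized model, then reset instance 0. */
int run_demo(struct sdma_model *m)
{
    static struct reset_cb kfd_cb = { kfd_pre, kfd_post, NULL, NULL };
    static struct reset_cb amdgpu_cb = { amdgpu_pre, amdgpu_post, NULL, NULL };

    kfd_cb.ctx = m;
    amdgpu_cb.ctx = m;
    register_reset_cb(m, &kfd_cb);
    register_reset_cb(m, &amdgpu_cb);
    return reset_engine(m, 0);
}
```

Calling `run_demo` on a zero-initialized `struct sdma_model` leaves five entries in `log`: both pre-reset events, the simulated reset, then both post-reset events. Keeping the list inside the per-device structure mirrors the v3 change, so each device carries its own set of callbacks.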
    
    This patch ensures that both AMDGPU and KFD can handle SDMA reset events
    gracefully, with proper state saving and restoration. It also provides a flexible
    callback mechanism for future extensions.
    
    v2: fix CamelCase and put the SDMA helper into amdgpu_sdma.c (Alex)
    
    v3: rename the `amdgpu_register_on_reset_callbacks` function to
        `amdgpu_sdma_register_on_reset_callbacks`
        move global reset_callback_list to struct amdgpu_sdma (Alex)
    
    v4: Update the reset callback function description and
        rename the reset function to amdgpu_sdma_reset_engine (Alex)
    
    Suggested-by: Alex Deucher <alexander.deucher@amd.com>
    Suggested-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
    Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>