1. 16 Sep, 2021 3 commits
    •
      drm/amdgpu: Fix crash on device remove/driver unload · cdccf1ff
      Andrey Grodzovsky authored
      BUG: unable to handle page fault for address: 00000000000010e1
      RIP: 0010:vega10_power_gate_vce+0x26/0x50 [amdgpu]
      Call Trace:
      pp_set_powergating_by_smu+0x16a/0x2b0 [amdgpu]
      amdgpu_dpm_set_powergating_by_smu+0x92/0xf0 [amdgpu]
      amdgpu_dpm_enable_vce+0x2e/0xc0 [amdgpu]
      vce_v4_0_hw_fini+0x95/0xa0 [amdgpu]
      amdgpu_device_fini_hw+0x232/0x30d [amdgpu]
      amdgpu_driver_unload_kms+0x5c/0x80 [amdgpu]
      amdgpu_pci_remove+0x27/0x40 [amdgpu]
      VCE/UVD depended on the SMC block for their suspend, but the
      SMC block is the first to do HW fini due to some constraints.
      Since the original patch was dealing with suspend issues,
      move the SMC block dependency back into the suspend hooks, as
      was done in V1 of the original patches.
      Keep flushing idle work in both the suspend and HW fini sequences
      since it's essential in both cases.
      2178d3c1 drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend
       drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend
      Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
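      The ordering problem above can be sketched with toy stand-ins (the
      function names and the callback are hypothetical, not the real amdgpu
      API): during device remove, SMC does HW fini first, so any later block
      that still calls into SMC from its hw_fini hook dereferences torn-down
      state; moving the SMC call into the suspend hook keeps it on a path
      where SMC is still alive.

      ```c
      #include <stdio.h>

      /* Hypothetical powergating callback owned by the SMC block; it is
       * torn down first during HW fini. */
      typedef void (*powergate_fn)(int gate);
      static powergate_fn smc_powergate;

      static void smc_do_powergate(int gate) { printf("SMC: gate=%d\n", gate); }

      /* SMC is the first block to do HW fini: its callback goes away. */
      static void smc_hw_fini(void) { smc_powergate = NULL; }

      /* Fixed scheme: VCE touches SMC only in suspend, while SMC is up. */
      static void vce_suspend(void)
      {
          if (smc_powergate)
              smc_powergate(1);
      }

      /* hw_fini no longer depends on SMC at all. */
      static void vce_hw_fini(void) { printf("VCE: hw_fini\n"); }

      int main(void)
      {
          smc_powergate = smc_do_powergate;

          /* Suspend path: SMC still up, powergating is safe. */
          vce_suspend();

          /* Device-remove path: SMC finishes first, then VCE,
           * which no longer calls into it. */
          smc_hw_fini();
          vce_hw_fini();
          return 0;
      }
      ```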
    •
      drm/amdgpu: Fix uvd ib test timeout when use pre-allocated BO · 4567162f
      xinhui pan authored
      Now we use the same BO for the create/destroy messages, so destroy
      will wait for the fence returned from create to be signaled. The
      default timeout in destroy is 10ms, which is too short.
      Let's wait on both fences with the specified timeout.
      Signed-off-by: xinhui pan <xinhui.pan@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
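      The fix's budgeting idea can be sketched outside the kernel (the fence
      type and wait function here are toy stand-ins, not dma_fence): compute
      one overall timeout, spend part of it waiting on the first fence, and
      give only the remainder to the second wait, instead of a fixed short
      per-wait timeout.

      ```c
      #include <stdbool.h>
      #include <stdio.h>

      /* Toy fence: 'remaining' ticks until it would signal. */
      struct fence { long remaining; };

      /* Toy wait: returns leftover ticks from 'timeout', or -1 if the
       * fence would not signal within the budget. */
      static long fence_wait_timeout(struct fence *f, long timeout)
      {
          if (f->remaining > timeout)
              return -1;
          return timeout - f->remaining;
      }

      /* Wait for both fences under one shared budget: the second wait
       * only gets whatever the first one left unused. */
      static bool wait_both(struct fence *a, struct fence *b, long timeout)
      {
          long left = fence_wait_timeout(a, timeout);
          if (left < 0)
              return false;
          return fence_wait_timeout(b, left) >= 0;
      }

      int main(void)
      {
          struct fence create  = { .remaining = 70 };
          struct fence destroy = { .remaining = 20 };

          /* A short fixed budget times out on the create fence alone... */
          printf("short budget: %s\n",
                 wait_both(&create, &destroy, 10) ? "ok" : "timeout");
          /* ...while one adequate overall budget covers both waits. */
          printf("large budget: %s\n",
                 wait_both(&create, &destroy, 100) ? "ok" : "timeout");
          return 0;
      }
      ```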
    •
      drm/amdgpu: Put drm_dev_enter/exit outside hot codepath · 7ba5400f
      xinhui pan authored
      We hit a soft hang while doing a memory pressure test on one NUMA
      system. After a quick look, this is because kfd invalidates/validates
      userptr memory frequently with the process_info lock held.
      It looks like updating the page table mapping uses too much CPU time.
      perf top says below,
      75.81%  [kernel]       [k] __srcu_read_unlock
       6.19%  [amdgpu]       [k] amdgpu_gmc_set_pte_pde
       3.56%  [kernel]       [k] __srcu_read_lock
       2.20%  [amdgpu]       [k] amdgpu_vm_cpu_update
       2.20%  [kernel]       [k] __sg_page_iter_dma_next
       2.15%  [drm]          [k] drm_dev_enter
       1.70%  [drm]          [k] drm_prime_sg_to_dma_addr_array
       1.18%  [kernel]       [k] __sg_alloc_table_from_pages
       1.09%  [drm]          [k] drm_dev_exit
      So move drm_dev_enter/exit outside the gmc code and instead let the
      callers do it. Those callers are gart_unbind, gart_map, vm_clear_bo,
      vm_update_pdes and gmc_init_pdb0; vm_bo_update_mapping already calls it.
      Signed-off-by: xinhui pan <xinhui.pan@amd.com>
      Reviewed-and-tested-by: Andrey G...
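      The optimization above is a classic guard-hoisting pattern. A minimal
      sketch, with counters standing in for drm_dev_enter/drm_dev_exit (the
      function names here are hypothetical), shows why taking the guard once
      per batch instead of once per page-table entry removes the per-entry
      SRCU overhead that dominated the perf profile:

      ```c
      #include <stdio.h>

      /* Counter stands in for the SRCU read-lock cost of drm_dev_enter. */
      static int guard_entries;
      static int dev_enter(void) { guard_entries++; return 1; }
      static void dev_exit(void) { }

      static void set_pte(int i) { (void)i; /* per-entry PTE write */ }

      /* Before: guard taken inside the hot loop -- O(n) enter/exit pairs. */
      static void update_ptes_inner_guard(int n)
      {
          for (int i = 0; i < n; i++) {
              if (!dev_enter())
                  return;
              set_pte(i);
              dev_exit();
          }
      }

      /* After: caller takes the guard once around the whole update. */
      static void update_ptes_outer_guard(int n)
      {
          if (!dev_enter())
              return;
          for (int i = 0; i < n; i++)
              set_pte(i);
          dev_exit();
      }

      int main(void)
      {
          guard_entries = 0;
          update_ptes_inner_guard(1000);
          printf("inner guard: %d entries\n", guard_entries);

          guard_entries = 0;
          update_ptes_outer_guard(1000);
          printf("outer guard: %d entries\n", guard_entries);
          return 0;
      }
      ```

      The patch applies the same move at each listed call site: the caller
      brackets the whole page-table update with one drm_dev_enter/exit pair.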
  2. 15 Sep, 2021 8 commits
  3. 14 Sep, 2021 16 commits
  4. 13 Sep, 2021 8 commits
  5. 10 Sep, 2021 5 commits