- Mar 05, 2025
-
-
jokim-amd authored
To reset hung SDMA queues on GFX 9.4+ for the GFX9 family, a soft reset must be issued through SMU. Since a soft reset resets an entire SDMA engine, use a common KGD call to do the reset, as the KGD will avoid resetting in-flight GFX and paging queues on that engine. In addition, create a common call for all reset types to simplify the handling of module parameter settings that block GPU resets. Signed-off-by:
Jonathan Kim <jonathan.kim@amd.com> Reviewed-by:
Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Victor Lu authored
SRIOV VF does not have write access to AGP BAR regs. Skip the writes to avoid a dmesg warning. Signed-off-by:
Victor Lu <victorchengchi.lu@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Sathishkumar S authored
For cores 1 through 7, repair the core reset sequence by adjusting offsets to access the expected registers. Signed-off-by:
Sathishkumar S <sathishkumar.sundararaju@amd.com> Reviewed-by:
Leo Liu <leo.liu@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Tony Yi authored
Add support for CPERs on VFs. VFs do not receive PMFW messages directly; as such, they need to query them from the host. To avoid hitting the host event guard, CPER queries need to be rate limited. CPER queries share the same RAS telemetry buffer as the error count query, so a mutex protecting the shared buffer was added as well. For readability, amdgpu_detect_virtualization was refactored into multiple individual functions. Signed-off-by:
Tony Yi <Tony.Yi@amd.com> Reviewed-by:
Tao Zhou <tao.zhou1@amd.com> Reviewed-by:
Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
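The rate limiting mentioned in the commit message above can be illustrated with a small userspace sketch. This is not the driver code: the `query_limiter` struct, the `query_allowed()` helper, and the interval are hypothetical names and values chosen for illustration, and the mutex protecting the shared telemetry buffer is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state: allow at most one query per interval. */
struct query_limiter {
	uint64_t interval_ms; /* minimum spacing between allowed queries */
	uint64_t last_ms;     /* timestamp of the last allowed query */
	bool primed;          /* false until the first query is allowed */
};

/*
 * Return true if a query may be sent at time now_ms, and record it.
 * Return false if the previous allowed query was too recent.
 * The caller supplies the clock so the logic stays deterministic.
 */
bool query_allowed(struct query_limiter *rl, uint64_t now_ms)
{
	if (rl->primed && now_ms - rl->last_ms < rl->interval_ms)
		return false;

	rl->last_ms = now_ms;
	rl->primed = true;
	return true;
}
```

A caller would check `query_allowed()` before each host query and skip the query when it returns false.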
-
James Zhu authored
before move to GTT domain. Signed-off-by:
James Zhu <James.Zhu@amd.com> Reviewed-by:
Felix Kuehling <felix.kuehling@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Aurabindo Pillai authored
drm_* macros are more helpful than DRM_* macros since the former indicate the associated DRM device that prints the error, which may be helpful when debugging. Signed-off-by:
Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Aurabindo Pillai authored
Implement a workaround for a panel which requires a 10s delay after link detect. Signed-off-by:
Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Tony Yi authored
Update amdgv_sriovmsg.h and mxgpu_nv.h to add new definitions for CPER support on VFs. PMFW ACA messages are not available on VFs, so VFs must query CPERs from the host. Signed-off-by:
Tony Yi <Tony.Yi@amd.com> Reviewed-by:
Tao Zhou <tao.zhou1@amd.com> Reviewed-by:
Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Kenneth Feng authored
Always allow IH interrupts from FW on SMU v14, based on the interface requirement. Signed-off-by:
Kenneth Feng <kenneth.feng@amd.com> Reviewed-by:
Yang Wang <kevinyang.wang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
lijo lazar authored
After a full device reset, the shared memory region is cleared, and it's not possible to reliably save the region in case of RAS errors. Reinitialize the flags if required. Signed-off-by:
Lijo Lazar <lijo.lazar@amd.com> Reviewed-by:
Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
lijo lazar authored
VCN IP versions >= 5.0 use the VCN5 fw shared struct. Signed-off-by:
Lijo Lazar <lijo.lazar@amd.com> Reviewed-by:
Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Andrew Martin authored
Through KFD IOCTL fuzzing we encountered a NULL pointer dereference when calling kfd_queue_acquire_buffers. Fixes: 629568d2 ("drm/amdkfd: Validate queue cwsr area and eop buffer size") Signed-off-by:
Andrew Martin <Andrew.Martin@amd.com> Reviewed-by:
Philip Yang <Philip.Yang@amd.com> Signed-off-by:
Andrew Martin <Andrew.Martin@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Alexandre Demers authored
DCE6 was missing soft reset, but it was easily identifiable under radeon. This should be it, pretty much as it is done under DCE8 and DCE10. Signed-off-by:
Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Alexandre Demers authored
Whitespace cleanups. Signed-off-by:
Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Alexandre Demers authored
Add some comments. Signed-off-by:
Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Asad Kamal authored
Fix indentation issue for smu_v_13_0_12 get_gpu_metrics Reported-by:
kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502272246.OISqUnC1-lkp@intel.com Signed-off-by:
Asad Kamal <asad.kamal@amd.com> Reviewed-by:
Lijo Lazar <lijo.lazar@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Asad Kamal authored
For vcn_v_5_0_1, set the power state to gated during hw_fini. The VCN engine may also hang during job execution, in which case it is not safe to assume that set_pg_state succeeds in gating the engine during hw_fini. After a reset, the engine can be assumed to be in its default state, so reset the driver-maintained state and set it to gated, per this assumption. Signed-off-by:
Asad Kamal <asad.kamal@amd.com> Suggested-by:
Lijo Lazar <lijo.lazar@amd.com> Reviewed-by:
Lijo Lazar <lijo.lazar@amd.com> Reviewed-by:
Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Dr. David Alan Gilbert authored
pqm_get_kernel_queue() has been unused since 2022's commit 5bdd3eb2 ("drm/amdkfd: Remove unused old debugger implementation") Remove it. Signed-off-by:
Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Dr. David Alan Gilbert authored
print__rq_dlg_params_st() was added in 2017 by commit 061bfa06 ("drm/amdgpu/display: Add dml support for DCN") but has remained unused. Remove it. Signed-off-by:
Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Dr. David Alan Gilbert authored
pre_surface_trace() has been unused since 2017's commit 745cc746 ("drm/amd/display: remove dc_pre_update_surfaces_to_stream from dc use") Remove it. Signed-off-by:
Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Dr. David Alan Gilbert authored
With phm_powerdown_uvd() gone in the previous patch, there's now no longer anything that reads the powerdown_uvd member of the pp_hwmgr_func. Remove it. There are a few assignments to it; a boring NULL which can just go, and two functions, but those functions are called explicitly anyway so the assignments to the member go. One of those (smu7_powerdown_uvd) wasn't static previously; make it static. Signed-off-by:
Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Dr. David Alan Gilbert authored
phm_powerdown_uvd() has been unused since 2017's commit 47047263 ("drm/amd/powerplay: delete eventmgr related files.") Remove it. Signed-off-by:
Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Dr. David Alan Gilbert authored
pp_atomfwctrl_get_pp_assign_pin() and pp_atomfwctrl_get_pp_assign_pin() were added in 2017 by commit 0d2c7569 ("drm/amdgpu: add new atomfirmware based helpers for powerplay") but have remained unused. Remove them, and the helper functions they used. Signed-off-by:
Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Richard István Thier authored
num_gb_pipes was set to a wrong value using r420_pipe_config. This has led to HyperZ glitches on fast Z clearing. Closes: https://bugs.freedesktop.org/show_bug.cgi?id=110897 Reviewed-by:
Marek Olšák <marek.olsak@amd.com> Signed-off-by:
Richard Thier <u9vata@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
Upgrading the kernel may cause some systems that were previously not using a firmware-specified brightness curve to use one. In the event of problems with this curve (for example, an interpolation error), add a new dcdebugmask value that can be used to turn it off. Also add an info message to show that custom brightness curves are currently in use. Reviewed-by:
Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250228185145.186319-6-mario.limonciello@amd.com Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
Some systems specify in the firmware a brightness curve that better reflects the characteristics of the panel used. This is done in the form of data points and matching luminance percentages. When converting a userspace-requested brightness value, use that curve to convert to a firmware-intended brightness value. Reviewed-by:
Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250228185145.186319-5-mario.limonciello@amd.com Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
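As an illustration of the conversion described in the commit message above, a curve of data points can be applied with piecewise-linear interpolation. The sketch below is a userspace approximation under stated assumptions, not the amdgpu_dm implementation: the `curve_point` struct, the `apply_brightness_curve()` function, and the units are hypothetical, and the points are assumed to be sorted by strictly increasing percentage.

```c
#include <stddef.h>

/* Hypothetical curve entry: a requested percentage and the matching level. */
struct curve_point {
	int percent; /* requested brightness, 0..100 */
	int level;   /* level the firmware wants at that percentage */
};

/*
 * Map a requested percentage onto the curve.
 * Values below the first point or above the last point are clamped.
 * With no curve at all (n == 0) the input is passed through unchanged.
 */
int apply_brightness_curve(const struct curve_point *pts, size_t n, int pct)
{
	size_t i;

	if (n == 0)
		return pct;
	if (pct <= pts[0].percent)
		return pts[0].level;

	for (i = 1; i < n; i++) {
		if (pct <= pts[i].percent) {
			int x0 = pts[i - 1].percent, x1 = pts[i].percent;
			int y0 = pts[i - 1].level, y1 = pts[i].level;

			/* x1 > x0 holds here because pct > x0 and pct <= x1. */
			return y0 + (y1 - y0) * (pct - x0) / (x1 - x0);
		}
	}
	return pts[n - 1].level;
}
```

Integer division truncates, so results between data points are rounded toward the lower point when the curve is increasing.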
-
Mario Limonciello authored
Making a copy of the backlight caps structure between uses is unnecessary. Refer to pointers to the same structure when using it. Reviewed-by:
Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250228185145.186319-4-mario.limonciello@amd.com Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
The ATIF method on some systems will provide a backlight curve. Pass this curve into amdgpu_dm and add it to the structures. Reviewed-by:
Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250228185145.186319-3-mario.limonciello@amd.com Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
As new members are introduced to the structure, copying the entire structure will help avoid missing them. Reviewed-by:
Alex Hung <alex.hung@amd.com> Link: https://lore.kernel.org/r/20250228185145.186319-2-mario.limonciello@amd.com Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Taimur Hassan authored
This version brings along the following fixes:
- Various cleanups to amdgpu dm
- Add DP tunneling IRQ handler
- Fix display corruption for dcn35
- Fix dmcub reset problem
- Adjust BW determination for PCON
- DIO encoder refactor
- Fix performance with SubVP under gaming
Acked-by:
Tom Chung <chiahsuan.chung@amd.com> Signed-off-by:
Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
drm_err() will show which device has the error. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
Scoped guards will release the mutex when they go out of scope. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
By using a _free() macro, multiple duplicated snippets of code to free the sink can be dropped. The sink will be released when leaving scope. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
A scoped guard will release the mutex when it goes out of scope. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
Using a _free(kfree) macro drops the need for a goto statement as it will be freed when it goes out of scope. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
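The scope-based freeing described in the commit message above relies on the compiler's cleanup attribute. The following userspace sketch shows the same idea with `malloc()`/`free()`; the `auto_free` macro, `free_ptr()` helper, `free_calls` counter, and `copy_len()` function are hypothetical names for illustration and are not the kernel's helpers. It assumes GCC or Clang.

```c
#include <stdlib.h>
#include <string.h>

int free_calls; /* counts frees so the behavior can be observed */

/* Cleanup handler: receives a pointer to the annotated variable. */
void free_ptr(void *p)
{
	void **pp = p;

	if (*pp) {
		free(*pp);
		free_calls++;
	}
}

/* Hypothetical userspace stand-in for a scope-based free annotation. */
#define auto_free __attribute__((cleanup(free_ptr)))

/*
 * Copy src into a temporary buffer and return its length, or -1 on error.
 * No goto-based unwinding is needed: buf is freed on every return path.
 */
int copy_len(const char *src)
{
	auto_free char *buf = malloc(strlen(src) + 1);

	if (!buf)
		return -1;
	if (src[0] == '\0')
		return -1; /* early error return, buf is still freed */

	strcpy(buf, src);
	return (int)strlen(buf);
}
```

Both the success path and the early error return release the buffer, which is what removes the need for a shared `goto` cleanup label.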
-
Mario Limonciello authored
amdgpu_dm_irq_resume_early() and amdgpu_dm_irq_resume_late() don't have any error flows. Change the return type from integer to void. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
drm_dbg() is helpful to show which device had the debug statement. Adjust to using this instead for debug messages. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
Scoped guards will release the mutex when they go out of scope. Adjust the code to use these instead. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
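To show what a scoped guard does, here is a userspace sketch built on the same cleanup attribute, using a toy lock instead of a real mutex. All names (`toy_lock`, `TOY_GUARD`, `update_counter()`) are hypothetical and chosen for illustration; this is not the kernel's guard implementation. It assumes GCC or Clang.

```c
/* Toy lock that only records whether it is held. */
struct toy_lock {
	int held;
};

int toy_lock_acquisitions; /* counts acquisitions for observation */

void toy_lock_acquire(struct toy_lock *l)
{
	l->held = 1;
	toy_lock_acquisitions++;
}

void toy_lock_release(struct toy_lock *l)
{
	l->held = 0;
}

/* Cleanup handler: runs when the guard variable goes out of scope. */
void toy_guard_exit(struct toy_lock **lp)
{
	if (*lp)
		toy_lock_release(*lp);
}

/* Acquire the lock now and release it automatically at end of scope. */
#define TOY_GUARD(name, lockp) \
	struct toy_lock *name __attribute__((cleanup(toy_guard_exit))) = \
		(toy_lock_acquire(lockp), (lockp))

/* Add delta to *counter under the lock; reject a zero delta. */
int update_counter(struct toy_lock *lock, int *counter, int delta)
{
	TOY_GUARD(g, lock);

	if (delta == 0)
		return -1; /* early return, lock is released automatically */

	*counter += delta;
	return 0;
}
```

Neither return path contains an explicit unlock call; the release happens when `g` leaves scope.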
-
Mario Limonciello authored
drm_err() is helpful to show which device had the error. Adjust to using this instead for error messages. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Mario Limonciello authored
All cases except a failure to create a copy of the current context will call dc_state_release() on the copied context. Use a _free() macro to free the context and then adjust the error handling flow to drop the unnecessary use of goto statements. Reviewed-by:
Alex Hung <alex.hung@amd.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Wayne Lin <wayne.lin@amd.com> Tested-by:
Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-