- Apr 19, 2022
-
-
Tales Aparecida authored
To make sure maintainers of amdgpu drivers are aware of any changes in their documentation, add its entry to MAINTAINERS. Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Tales Aparecida authored
Add missing acronyms to the amdgpu glossary. Closes: drm/amd#1939 Acked-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Tom Rix authored
evergreen_default_state and evergreen_default_size are only used in evergreen.c. Single file symbols should be static. So move their definitions to evergreen_blit_shaders.h and change their storage-class-specifier to static. Remove the unneeded evergreen_blit_shader.c. The evergreen_ps/vs definitions were removed with commit 4f862967 ("drm/radeon/kms: remove r6xx+ blit copy routines"), so their declarations in evergreen_blit_shader.h are not needed; remove them. Signed-off-by:
Tom Rix <trix@redhat.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Tom Rix authored
Smatch reports this issue: virtual_link_hwss.c:32:6: warning: symbol 'virtual_setup_stream_attribute' was not declared. Should it be static? virtual_setup_stream_attribute is only used in virtual_link_hwss.c, but the other functions in the file are declared in the header file and used elsewhere. For consistency, add the virtual_setup_stream_attribute declaration to virtual_link_hwss.h. Signed-off-by:
Tom Rix <trix@redhat.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Keita Suzuki authored
In function si_parse_power_table(), the array adev->pm.dpm.ps and its members are allocated. If the allocation of any member fails, the array itself is freed and the function returns an error code. However, the array is later freed again in si_dpm_fini(), which is called when the function returns an error. This leads to a potential double free of the array adev->pm.dpm.ps, as well as a leak of its members, since the members are not freed in the allocation function and the array pointer is not set to NULL when freed. In addition, adev->pm.dpm.num_ps, which keeps track of the allocated array members, is not updated until all member allocations have finished successfully; this could also lead to either a use after free or an uninitialized variable access in si_dpm_fini(). Fix this by postponing the free of the array until si_dpm_fini() and incrementing adev->pm.dpm.num_ps every time an array member is allocated. Signed-off-by:
Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
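The ownership rule described in this fix can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct dpm_table`, `table_parse()`, `table_fini()` and the `fail_at` parameter are hypothetical stand-ins, not the driver's code. The parse step never frees on failure and counts each member as soon as it is allocated, so a single cleanup routine can release exactly what exists.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's power-state structures. */
struct ps_entry {
    void *ps_priv;          /* per-state private data */
};

struct dpm_table {
    struct ps_entry *ps;    /* array of power states */
    unsigned int num_ps;    /* members whose ps_priv has been allocated */
};

/* Single cleanup path: frees the counted members, then the array. */
void table_fini(struct dpm_table *t)
{
    assert(t != NULL);
    for (unsigned int i = 0; i < t->num_ps; i++)
        free(t->ps[i].ps_priv);
    free(t->ps);
    t->ps = NULL;
    t->num_ps = 0;
}

/*
 * Allocate the array and its members. fail_at simulates an allocation
 * failure at that index (negative means never fail). On failure nothing
 * is freed here; the caller is expected to run table_fini().
 */
int table_parse(struct dpm_table *t, unsigned int count, int fail_at)
{
    assert(t != NULL && count > 0);
    t->num_ps = 0;
    t->ps = calloc(count, sizeof(*t->ps));
    if (t->ps == NULL)
        return -1;

    for (unsigned int i = 0; i < count; i++) {
        void *priv = ((int)i == fail_at) ? NULL : malloc(16);
        if (priv == NULL)
            return -1;      /* leave t->ps for table_fini() */
        t->ps[i].ps_priv = priv;
        t->num_ps++;        /* count each member as soon as it exists */
    }
    return 0;
}
```

After a failed `table_parse()`, one call to `table_fini()` frees the partially filled table once, with no double free and no leaked members.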
-
Tales Aparecida authored
It's a local function, let's make it static. AGD: remove prototype in dcn10_hubp.h Signed-off-by:
Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com>
-
Darren Powell authored
Clarify the smu_cmn_send_smc_msg_with_param documentation to mention two cases exist where messages are silently dropped with no error returned. These cases occur in unusual situations where either: 1. the message type is not allowed to a virtual GPU, or 2. a PCI recovery is underway and the HW is not yet in sync with the SW For more details see commit 4ea5081c ("drm/amd/powerplay: enable SMC message filter") commit bf36b52e ("drm/amdgpu: Avoid accessing HW when suspending SW state") (v2) Reworked with suggestions from Luben & Paul (v3) Updated wording as per Luben's feedback Corrected error stating all messages denied on virtual GPU (each GPU has mask of which messages are allowed) Signed-off-by:
Darren Powell <darren.powell@amd.com> Reviewed-by:
Luben Tuikov <luben.tuikov@amd.com>
-
- Apr 16, 2022
-
-
Huang Rui authored
Check whether pp_funcs is initialized when releasing the context; otherwise a NULL pointer dereference panic is triggered when the software SMU is not enabled.
[ 1109.404555] BUG: kernel NULL pointer dereference, address: 0000000000000078
[ 1109.404609] #PF: supervisor read access in kernel mode
[ 1109.404638] #PF: error_code(0x0000) - not-present page
[ 1109.404657] PGD 0 P4D 0
[ 1109.404672] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 1109.404701] CPU: 7 PID: 9150 Comm: amdgpu_test Tainted: G OEL 5.16.0-custom #1
[ 1109.404732] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 1109.404765] RIP: 0010:amdgpu_dpm_force_performance_level+0x1d/0x170 [amdgpu]
[ 1109.405109] Code: 5d c3 44 8b a3 f0 80 00 00 eb e5 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 4c 8b b7 f0 7d 00 00 <49> 83 7e 78 00 0f 84 f2 00 00 00 80 bf 87 80 00 00 00 48 89 fb 0f
[ 1109.405176] RSP: 0018:ffffaf3083ad7c20 EFLAGS: 00010282
[ 1109.405203] RAX: 0000000000000000 RBX: ffff9796b1c14600 RCX: 0000000002862007
[ 1109.405229] RDX: ffff97968591c8c0 RSI: 0000000000000001 RDI: ffff9796a3700000
[ 1109.405260] RBP: ffffaf3083ad7c50 R08: ffffffff9897de00 R09: ffff979688d9db60
[ 1109.405286] R10: 0000000000000000 R11: ffff979688d9db90 R12: 0000000000000001
[ 1109.405316] R13: ffff9796a3700000 R14: 0000000000000000 R15: ffff9796a3708fc0
[ 1109.405345] FS: 00007ff055cff180(0000) GS:ffff9796bfdc0000(0000) knlGS:0000000000000000
[ 1109.405378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1109.405400] CR2: 0000000000000078 CR3: 000000000a394000 CR4: 00000000000506e0
[ 1109.405434] Call Trace:
[ 1109.405445] <TASK>
[ 1109.405456] ? delete_object_full+0x1d/0x20
[ 1109.405480] amdgpu_ctx_set_stable_pstate+0x7c/0xa0 [amdgpu]
[ 1109.405698] amdgpu_ctx_fini.part.0+0xcb/0x100 [amdgpu]
[ 1109.405911] amdgpu_ctx_do_release+0x71/0x80 [amdgpu]
[ 1109.406121] amdgpu_ctx_ioctl+0x52d/0x550 [amdgpu]
[ 1109.406327] ? _raw_spin_unlock+0x1a/0x30
[ 1109.406354] ? drm_gem_handle_delete+0x81/0xb0 [drm]
[ 1109.406400] ? amdgpu_ctx_get_entity+0x2c0/0x2c0 [amdgpu]
[ 1109.406609] drm_ioctl_kernel+0xb6/0x140 [drm]
Signed-off-by:
Huang Rui <ray.huang@amd.com> Reviewed-by:
Aaron Liu <aaron.liu@amd.com>
-
- Apr 15, 2022
-
-
Lang Yu authored
The idea is from commit a50fe707 ("drm/amdkfd: Only apply heavy-weight TLB flush on Aldebaran") and commit f61c40c0 ("drm/amdkfd: enable heavy-weight TLB flush on Arcturus"). At the moment, a heavy-weight TLB flush could cause problems on ASICs other than Aldebaran and Arcturus. A simple hipMallocManaged/hipFree program could trigger this issue.
[ 97.787657] amdgpu 0000:01:00.0: amdgpu: wait for kiq fence error: 0.
[ 106.868758] amdgpu: qcm fence wait loop timeout expired
[ 106.868966] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption
[ 106.869203] amdgpu: Failed to evict process queues
[ 106.869261] amdgpu: Failed to quiesce KFD
Signed-off-by:
Lang Yu <Lang.Yu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com>
-
Lang Yu authored
To make kfd_flush_tlb_after_unmap visible in kfd_svm.c, move it into kfd_priv.h. And change it to an inline function. Signed-off-by:
Lang Yu <Lang.Yu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com>
-
- Apr 14, 2022
-
-
Gavin Wan authored
[why] These static variables save the RLC scratch register addresses. When multiple GPUs are installed (for example, in an XGMI setup) and call the function at the same time, each GPU overwrites the addresses saved by the others, which causes reads and writes to go to the wrong GPU. [fix] Remove static from the variables so that they live on the stack. Reviewed-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Gavin Wan <Gavin.Wan@amd.com> Change-Id: Iee78849291d4f7a9688ecc5165bec70ee85cdfbe
-
Felix Kuehling authored
Add the waiters to the wait queue during initialization, while holding the event spinlock. Otherwise the waiter will not get activated if the event signals before being added to the wait queue. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Philip Yang <Philip.Yang@amd.com>
-
Rodrigo Siqueira authored
This reverts commit 367b3e93. While we were testing DCN3.1 with a hub, we noticed that only one of 2 connected displays lights up when using some specific display resolution. In summary, this was the setup: 1. Displays: * Sharp LQ156M1JW26 (eDP): 1080@240 * BENQ SW320 (DP): 4k@60 * BENQ EX3203R (DP): 4k@60 2. Hub: Club3D CSV-7300 3. ASIC: DCN3.1 After bisecting this issue, we figured out the commit mentioned above introduced this issue. We are investigating why this patch introduced this regression, but we need to revert it for now. Cc: Harry Wentland <harry.wentland@amd.com> Cc: Mark Broadworth <Mark.Broadworth@amd.com> Cc: Michael Strauss <michael.strauss@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
-
xinhui pan authored
The VM might already be freed when amdgpu_vm_tlb_seq_cb() is called. We see the call trace below. Fix it by keeping the last flush fence around and waiting for it to signal.
BUG kmalloc-4k (Not tainted): Poison overwritten
0xffff9c88630414e8-0xffff9c88630414e8 @offset=5352. First byte 0x6c instead of 0x6b
Allocated in amdgpu_driver_open_kms+0x9d/0x360 [amdgpu] age=44 cpu=0 pid=2343
__slab_alloc.isra.0+0x4f/0x90
kmem_cache_alloc_trace+0x6b8/0x7a0
amdgpu_driver_open_kms+0x9d/0x360 [amdgpu]
drm_file_alloc+0x222/0x3e0 [drm]
drm_open+0x11d/0x410 [drm]
Freed in amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu] age=22 cpu=1 pid=2485
kfree+0x4a2/0x580
amdgpu_driver_postclose_kms+0x3e9/0x550 [amdgpu]
drm_file_free+0x24e/0x3c0 [drm]
drm_close_helper.isra.0+0x90/0xb0 [drm]
drm_release+0x97/0x1a0 [drm]
__fput+0xb6/0x280
____fput+0xe/0x10
task_work_run+0x64/0xb0
Suggested-by:
Christian König <christian.koenig@amd.com> Signed-off-by:
xinhui pan <xinhui.pan@amd.com> Reviewed-by:
Christian König <christian.koenig@amd.com>
-
- Apr 13, 2022
-
-
If lookup_event_by_id() returns a NULL "ev" pointer then the spin_lock(&ev->lock) will crash. This was detected by Smatch: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_events.c:644 kfd_set_event() error: we previously assumed 'ev' could be null (see line 639) Fixes: 5273e82c ("drm/amdkfd: Improve concurrency of event handling") Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com>
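The pattern behind this fix is to test the lookup result before touching the lock stored inside the object. A minimal userspace sketch, assuming a hypothetical `struct event_sketch` and helper names, with an integer flag standing in for the spinlock:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical event object; lock_held stands in for a spinlock. */
struct event_sketch {
    int lock_held;
    int signaled;
};

/* Returns NULL when the id does not name an existing event. */
struct event_sketch *lookup_event_sketch(struct event_sketch *table,
                                         size_t count, size_t id)
{
    return (id < count) ? &table[id] : NULL;
}

int set_event_sketch(struct event_sketch *table, size_t count, size_t id)
{
    struct event_sketch *ev;

    assert(table != NULL);
    ev = lookup_event_sketch(table, count, id);
    if (ev == NULL)
        return -EINVAL;     /* bail out before dereferencing ev */

    ev->lock_held = 1;      /* stands in for spin_lock(&ev->lock) */
    ev->signaled = 1;
    ev->lock_held = 0;      /* stands in for spin_unlock(&ev->lock) */
    return 0;
}
```

An unknown id now produces an error code instead of a crash on the lock access.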
-
- Apr 12, 2022
-
-
Update PCI revision id check for the new YC platform variant. Signed-off-by:
Vijendar Mukunda <Vijendar.Mukunda@amd.com> Link: https://lore.kernel.org/r/20220411134119.1767646-1-Vijendar.Mukunda@amd.com Signed-off-by:
Mark Brown <broonie@kernel.org> (cherry picked from commit b1630fcb)
-
Mukul Joshi authored
Currently, the IO-links to the device being removed from topology, are not cleared. As a result, there would be dangling links left in the KFD topology. This patch aims to fix the following: 1. Cleanup all IO links to the device being removed. 2. Ensure that node numbering in sysfs and nodes proximity domain values are consistent after the device is removed: a. Adding a device and removing a GPU device are made mutually exclusive. b. The global proximity domain counter is no longer required to be an atomic counter. A normal 32-bit counter can be used instead. 3. Update generation_count to let user-mode know that topology has changed due to device removal. CC: Shuotao Xu <shuotaoxu@microsoft.com> Reviewed-by:
Shuotao Xu <shuotaoxu@microsoft.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by:
Mukul Joshi <mukul.joshi@amd.com>
-
Yongqiang Sun authored
MS_HYPERV with vega10 doesn't have the interface to process request init data msg. Check hypervisor type to not send the request for MS_HYPERV. Signed-off-by:
Yongqiang Sun <yongqiang.sun@amd.com> Reviewed-by:
Alice Wong <shiwei.wong@amd.com> Acked-by:
Alex Deucher <alexander.deucher@amd.com>
-
Lang Yu authored
A MAX_GPU_INSTANCE bits bitmap will suffice. Signed-off-by:
Lang Yu <Lang.Yu@amd.com> Reviewed-by:
Felix Kuehling <Felix.Kuehling@amd.com>
-
[why] Extract update stream allocation table into link hwss as part of the link hwss refactor work. Reviewed-by:
George Shen <George.Shen@amd.com> Reviewed-by:
Fangzhi Zuo <Jerry.Zuo@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Wenjing Liu <wenjing.liu@amd.com>
-
[why] creating a generic helper for AMD specific PSR-SU sink validation. Moving the function to the power module to reference it across all OS. [how] - drop PSRSU specific sink validation helper and move to power module by reading PSR version and other PSR caps - call the new helper from linux DM (amdgpu_dm_psr) Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Acked-by:
Tom Chung <chiahsuan.chung@amd.com> Signed-off-by:
David Zhang <dingchen.zhang@amd.com>
-
Title: DC Patches April 6, 2022 This DC patchset brings improvements in multiple areas. In summary, we highlight:
* Disabling Z10 on DCN31
* Fix issue breaking 32bit Linux build
* Fix inconsistent timestamp type
* Add DCN30 support FEC init
* Fix crash on setting VRR with no display connected
* Disable FEC if DSC not supported for EDP
* Add odm seamless boot support
* Select correct DTO source
* Power down hardware if timer not trigger
Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Aric Cyr <aric.cyr@amd.com>
-
[WHY&HOW] Change criteria for setting DTO source value, and always set it regardless of the signal type. Reviewed-by:
Ariel Bernstein <Eric.Bernstein@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Dillon Varone <dillon.varone@amd.com>
-
[Why] Currently, the 32bit linux build is failing due to an issue with using the built-in / operator with a 64bit dividend. Doing so generates code which calls __udivdi3() in libgcc. However, libgcc is not linked with the kernel at this point in the build, hence this causes the 32bit build to fail to compile. [How] Change the / operator to div_u64 instead. Reviewed-by:
Aric Cyr <Aric.Cyr@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Hayden Goodfellow <Hayden.Goodfellow@amd.com>
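To show why a helper sidesteps the libgcc call, here is a userspace sketch of a 64-by-32-bit division that uses only shifts, compares and subtraction, so the compiler never has to emit a call to __udivdi3(). It is an illustration in the spirit of div_u64, not the kernel's implementation; `div_u64_sketch` and its optional remainder parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Shift-and-subtract long division: walks the dividend from the most
 * significant bit down, building the quotient one bit at a time.
 * No 64-bit '/' or '%' operator is used.
 */
uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint64_t quotient = 0;
    uint64_t rem = 0;

    assert(divisor != 0);
    for (int bit = 63; bit >= 0; bit--) {
        /* Bring down the next bit of the dividend. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }
    if (remainder != NULL)
        *remainder = (uint32_t)rem;
    return quotient;
}
```

Inside the kernel, the existing div_u64() helper should be used rather than a hand-written loop like this one.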
-
Felix Kuehling authored
The synchronize_rcu call in destroy_events can take several ms, which noticeably slows down applications destroying many events. Use kfree_rcu to free the event structure asynchronously and eliminate the synchronize_rcu call in the user thread. Signed-off-by:
Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by:
Philip Yang <Philip.Yang@amd.com>
-
- Apr 11, 2022
-
-
hersen wu authored
[Why] During DC link detection, DP link training is executed for external SST DP. For debugging purposes, we may need to skip DP link training. [How] Expose the DC debug option skip_detection_link_training through debugfs. Reviewed-by:
Roman Li <Roman.Li@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
hersen wu <hersenxs.wu@amd.com>
-
Dillon Varone authored
Reviewed-by:
Ariel Bernstein <Eric.Bernstein@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Dillon Varone <dillon.varone@amd.com>
-
Angus Wang authored
[WHY] An unsigned int timestamp variable is assigned with an unsigned long long value. Also, the assignment directly converts the tick value to us without using built-in get elapsed time function. [HOW] Cast the assigned value correctly and also use built-in function to get the timestamp in the unit we want. Reviewed-by:
Aric Cyr <Aric.Cyr@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Angus Wang <Angus.Wang@amd.com>
-
Jingwen Zhu authored
[Why] FEC init used on DCN30. [How] Check fec active when HW init. Co-authored-by:
Jingwen Zhu <Jingwen.Zhu@github.amd.com> Reviewed-by:
Wenjing Liu <Wenjing.Liu@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Jingwen Zhu <Jingwen.Zhu@github.amd.com>
-
Mario Limonciello authored
When `osc_pc_lpi_support_confirmed` is set through `_OSC` and `_LPI` is populated then the cpuidle driver assumes that LPI is fully functional. However currently the kernel only provides architectural support for LPI on ARM. This leads to high power consumption on X86 platforms that otherwise try to enable LPI. So probe whether or not LPI support is implemented before enabling LPI in the kernel. This is done by overloading `acpi_processor_ffh_lpi_probe` to check whether it returns `-EOPNOTSUPP`. It also means that all future implementations of `acpi_processor_ffh_lpi_probe` will need to follow these semantics as well. Reviewed-by:
Sudeep Holla <sudeep.holla@arm.com> Signed-off-by:
Mario Limonciello <mario.limonciello@amd.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> (cherry picked from commit eb087f30)
-
On newer AMD platforms with SFH, it is observed that random interrupts get generated on the SFH hardware, and until these are cleared the firmware sensor processing is stalled, resulting in no data being received on the driver side. Add routines to handle these interrupts, so that firmware operations are not stalled. Signed-off-by:
Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit 7f016b35)
-
Newer AMD platforms with SFH may generate interrupts on some events which are unwarranted. Until these are cleared, the actual MP2 data processing may be stalled in some cases. Add a mechanism to clear the pending interrupts (if any) during the driver initialization and sensor command operations. Signed-off-by:
Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit fb75a379)
-
Sensor data is processed in polling mode. Hence disable the interrupt for all sensor commands. Signed-off-by:
Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit b300667b)
-
Misinterpreted intr_enable field name. Hence correct the structure field name accordingly to reflect the functionality. Fixes: f264481a ("HID: amd_sfh: Extend driver capabilities for multi-generation support") Signed-off-by:
Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit aa0b724a)
-
Since in the current amd_sfh design the sensor data is periodically obtained in the form of poll data, during the suspend/resume cycle, scheduling a delayed work adds no value. So, cancel the work and restart back during the suspend/resume cycle respectively. Signed-off-by:
Basavaraj Natikar <Basavaraj.Natikar@amd.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit 0cf74235)
-
Duncan Ma authored
[WHY] Implement changes to transition from Pre-OS odm to Post-OS odm support. Seamless boot case is also considered. [HOW] Revised validation logic when marking for seamless boot. Init resources accordingly when Pre-OS has odm enabled. Reset odm and det size when transitioning Pre-OS odm to Post-OS non-odm to avoid corruption. Apply logic to set odm accordingly upon commit. Reviewed-by:
Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
"Duncan Ma" <duncan.ma@amd.com>
-
Oliver Logush authored
[why] Need to update the update_clock sequence to a fully tested sequence for dcn30 [how] Removed the check to see if clock is lowered Reviewed-by:
Charlene Liu <Charlene.Liu@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Oliver Logush <oliver.logush@amd.com>
-
Oliver Logush authored
[why] Make sure smu is not busy before sending another request, this is to prevent stress failures from MS. [how] Check to make sure the SMU fw busy signal is cleared before sending another request Reviewed-by:
Charlene Liu <Charlene.Liu@amd.com> Reviewed-by:
Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Oliver Logush <oliver.logush@amd.com>
-
Paul Hsieh authored
[WHY] In headless systems, if the SetMode/Power down timer is not called, hardware will not be powered down, causing HW/SW discrepancies. Powering down hardware on SetPowerState to D3 will ensure SW/HW state is accurate. [HOW] 1. If the PowerDownThread timer is not triggered but the OS calls SetPowerState to D3, power down hardware. 2. Update the HDMI hang w/a to apply to all TMDS signals on headless systems. Reviewed-by:
Martin Leung <Martin.Leung@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Paul Hsieh <paul.hsieh@amd.com>
-
Charlene Liu authored
[why] dcn316's dtbclk is from non_ss clock source. no compensation required here. Reviewed-by:
Chris Park <Chris.Park@amd.com> Acked-by:
Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by:
Charlene Liu <Charlene.Liu@amd.com>
-