- Nov 01, 2024
-
-
Matthew Brost authored
GGTT mappings reside on the device and this state is lost during suspend / d3cold thus this state must be restored resume regardless if the BO is in system memory or VRAM. v2: - Unnecessary parentheses around bo->placements[0] (Checkpatch) Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031182257.2949579-1-matthew.brost@intel.com
-
Thomas Hellström authored
Multiple single-ioctl binds can be split up into multiple bind ioctls, reducing the memory required to hold the bind array. So rather than allowing the OOM killer to be invoked, return -ENOMEM or -ENOBUFS to user-space to take corrective action. v2: - Add __GFP_NOWARN to __GFP_RETRY_MAYFAIL allocations to avoid spamming the kernel log if a recoverable allocation fails. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: drm/xe/kernel#2701 Signed-off-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031153732.164995-3-thomas.hellstrom@linux.intel.com
-
Thomas Hellström authored
Rather than invoking the OOM killer on buffer object memory allocations and validations, have the allocations fail and pass the error to user-space if applicable. Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: drm/xe/kernel#2701 Signed-off-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031153732.164995-2-thomas.hellstrom@linux.intel.com
-
Flush the g2h worker explicitly if TLB timeout happens which is observed on LNL and that points to the recent scheduling issue with E-cores on LNL. This is similar to the recent fix: commit e5152723 ("drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout") and should be removed once there is E core scheduling fix. v2: Add platform check(Himal) v3: Remove gfx platform check as the issue related to cpu platform(John) Use the common WA macro(John) and print when the flush resolves timeout(Matt B) v4: Remove the resolves log and do the flush before taking pending_lock(Matt A) Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org # v6.11+ Link: drm/xe/kernel#2687 Signed-off-by:
Nirmoy Das <nirmoy.das@intel.com> Reviewed-by:
Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-3-nirmoy.das@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Flush xe ordered_wq in case of ufence timeout which is observed on LNL and that points to recent scheduling issue with E-cores. This is similar to the recent fix: commit e5152723 ("drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout") and should be removed once there is a E-core scheduling fix for LNL. v2: Add platform check(Himal) s/__flush_workqueue/flush_workqueue(Jani) v3: Remove gfx platform check as the issue related to cpu platform(John) v4: Use the Common macro(John) and print when the flush resolves timeout(Matt B) Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org # v6.11+ Link: drm/xe/kernel#2754 Suggested-by:
Matthew Brost <matthew.brost@intel.com> Signed-off-by:
Nirmoy Das <nirmoy.das@intel.com> Reviewed-by:
Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-2-nirmoy.das@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Move LNL scheduling WA to xe_device.h so this can be used in other places without needing keep the same comment about removal of this WA in the future. The WA, which flushes work or workqueues, is now wrapped in macros and can be reused wherever needed. Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> cc: stable@vger.kernel.org # v6.11+ Suggested-by:
John Harrison <John.C.Harrison@Intel.com> Signed-off-by:
Nirmoy Das <nirmoy.das@intel.com> Reviewed-by:
Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-1-nirmoy.das@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Drop the exclusive client count tracking and use the filelist from the drm to track the active clients. This also ensures the clients created internally by the driver won't block changing the ccs mode. Fixes: ce8c161c ("drm/xe: Add ref counting for xe_file") Signed-off-by:
Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by:
Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008073628.377433-3-balasubramani.vivekanandan@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
CCS_MODE register requires setting mask bits from Xe2+ platforms. Set the mask bits unconditionally, as those bits are unused for older platforms. Signed-off-by:
Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Cc: stable@vger.kernel.org # v6.11+ Reviewed-by:
Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008073628.377433-2-balasubramani.vivekanandan@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Lucas De Marchi authored
needs_wa_1607983814() predates wa_oob, so it was not being printed in /sys/kernel/debug/dri/0/*/workarounds. Port it to OOB rules. This makes the WA show up in debugfs. For TGL: OOB Workarounds 1607983814 22012773006 1409600907 Eventually the RTP infra may add support for writing registers in a loop, which would allow to keep track of the registers as well. But for now, just listing it as OOB workaround is already an improvement. Reviewed-by:
Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029193258.749882-1-lucas.demarchi@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
- Oct 31, 2024
-
-
The pointer list->list is dereferenced before the NULL check. Fix this by moving the NULL check outside the for loop, so that the check is performed before the dereferencing. The list->list pointer cannot be NULL so this has no effect on runtime. It's just a correctness issue. This issue was reported by Coverity Scan. https://scan7.scan.coverity.com/#/project-view/51525/11354?selectedIssue=1600335 Fixes: 0f1fdf55 ("drm/xe/guc: Save manual engine capture into capture list") Signed-off-by:
Everest K.C. <everestkc@everestkc.com.np> Reviewed-by:
Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023233356.5479-1-everestkc@everestkc.com.np
-
Short circuiting TDR on jobs not started is an optimization which is not required. On LNL we are facing an issue where jobs do not get scheduled by the GuC if it misses a GGTT page update. When this occurs let the TDR fire, toggle the scheduling which may get the job unstuck, and print a warning message. If the TDR fires twice on job that hasn't started, timeout the job. v2: - Add warning message (Paulo) - Add fixes tag (Paulo) - Timeout job which hasn't started after TDR firing twice v3: - Include local change v4: - Short circuit check_timeout on job not started - use warn level rather than notice (Paulo) Fixes: 7ddb9403 ("drm/xe: Sample ctx timestamp to determine if jobs have timed out") Cc: stable@vger.kernel.org Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241025214330.2010521-2-matthew.brost@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
On LNL without a mmio read before a GGTT invalidate the GuC can incorrectly read the GGTT scratch page upon next access leading to jobs not getting scheduled. A mmio read before a GGTT invalidate seems to fix this. Since a GGTT invalidate is not a hot code path, blindly do a mmio read before each GGTT invalidate. Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by:
Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: drm/xe/kernel#3164 Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023221200.1797832-1-matthew.brost@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
- Oct 30, 2024
-
-
This reverts commit 55858fa7. '55858fa7 ("drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs")' was not properly reviewed and also causes dmesg asserts in CI. Revert it. Closes: drm/xe/kernel#3295 Fixes: 55858fa7 ("drm/xe/xe_guc_ads: save/restore OA registers and allowlist regs") Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029200147.1476513-1-ashutosh.dixit@intel.com
-
- Oct 29, 2024
-
-
John Harrison authored
The guc_info debugfs file is meant to be a quick view of the current software state of the GuC interface. Including the full CTB contents makes the file as a whole much less human readable and is not partiular useful in the general case. So don't pollute the info dump with the full buffers. Instead, move those into a separate debugfs entry that can be read when that information is actually required. Also, improve the human readability by adding a few extra blank lines to delimt the sections. v2: Hide the internal capture/print params from external callers that don't need to know (review feedback from Matthew Brost). Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241024002554.1983101-3-John.C.Harrison@Intel.com
-
John Harrison authored
The extra bits are not hugely useful because the GuC log only uses 32bit time stamps. But they exist so might as well provide them. Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241024002554.1983101-2-John.C.Harrison@Intel.com
-
- Oct 28, 2024
-
-
Several OA registers and allowlist registers were missing from the save/restore list for GuC and could be lost during an engine reset. Add them to the list. v2: - Fix commit message (Umesh) - Add missing closes (Ashutosh) v3: - Add missing fixes (Ashutosh) Closes: drm/xe/kernel#2249 Fixes: dd08ebf6 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Suggested-by:
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Suggested-by:
John Harrison <john.c.harrison@intel.com> Signed-off-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> CC: stable@vger.kernel.org # v6.11+ Reviewed-by:
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023200716.82624-1-jonathan.cavitt@intel.com
-
- Oct 23, 2024
-
-
Ashutosh Dixit authored
Whereas all properties can be specified during OA stream open, when the OA stream is reconfigured only the config_id and syncs can be specified. v2: Use separate function table for reconfig case (Jonathan) Change bool function args to enum (Matt B) v3: s/xe_oa_set_property_funcs/xe_oa_set_property_funcs_open/ (Jonathan) Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Suggested-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-8-ashutosh.dixit@intel.com
-
Ashutosh Dixit authored
In addition to stream open, add xe_sync support to the OA config ioctl, where it is even more useful. This allows e.g. Mesa to replay a workload repeatedly on the GPU, each time with a different OA configuration, while precisely controlling (at batch buffer granularity) the workload segment for which a particular OA configuration is active, without introducing stalls in the userspace pipeline. v2: Emit OA config even when config id is same as previous, to ensure consistent sync behavior (Jose) Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-7-ashutosh.dixit@intel.com
-
Ashutosh Dixit authored
No code changes, only code movement so that functions used during stream open can be reused for the stream reconfiguration ioctl (DRM_XE_OBSERVATION_IOCTL_CONFIG). Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-6-ashutosh.dixit@intel.com
-
Ashutosh Dixit authored
Introduce 'struct xe_oa_fence' which includes the dma_fence used to signal output fences in the xe_sync array. The fences are signaled asynchronously. When there are no output fences to signal, the OA configuration wait is synchronously re-introduced into the ioctl. v2: Don't wait in the work, use callback + delayed work (Matt B) Use a single, not a per-fence spinlock (Matt Brost) v3: Move ofence alloc before job submission (Matt) Assert, don't fail, from dma_fence_add_callback (Matt) Additional dma_fence_get for dma_fence_wait (Matt) Change dma_fence_wait to non-interruptible (Matt) v4: Introduce last_fence to prevent uaf if stream is closed with pending OA config jobs v5: Remove oa_fence_lock, move spinlock back into xe_oa_fence to prevent uaf Suggested-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-5-ashutosh.dixit@intel.com
-
Ashutosh Dixit authored
Add input fence dependencies which will make OA configuration wait till these dependencies are met (till input fences signal). v2: Change add_deps arg to xe_oa_submit_bb from bool to enum (Matt Brost) Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-4-ashutosh.dixit@intel.com
-
Ashutosh Dixit authored
Now that we have laid the groundwork, introduce OA sync properties in the uapi and parse the input xe_sync array as is done elsewhere in the driver. Also add DRM_XE_OA_CAPS_SYNCS bit in OA capabilities for userspace. v2: Fix and document DRM_XE_SYNC_TYPE_USER_FENCE for OA (Matt B) Add DRM_XE_OA_CAPS_SYNCS bit to OA capabilities (Jose) Acked-by:
José Roberto de Souza <jose.souza@intel.com> Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-3-ashutosh.dixit@intel.com
-
Ashutosh Dixit authored
When we introduce xe_syncs, we don't wait for internal OA programming batches to complete. That is, xe_syncs are signaled asynchronously. In anticipation for this, separate out batch submission from waiting for completion of those batches. v2: Change return type of xe_oa_submit_bb to "struct dma_fence *" (Matt B) v3: Retain init "int err = 0;" in xe_oa_submit_bb (Jose) Reviewed-by:
Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by:
Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-2-ashutosh.dixit@intel.com
-
Matthew Brost authored
GT ordered work queue can be used to free memory via resets and fence signaling thus we should allow this work queue to run during reclaim. Mark with GT ordered work queue with WQ_MEM_RECLAIM appropriately. Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021175705.1584521-5-matthew.brost@intel.com
-
Matthew Brost authored
G2H work queue can be used to free memory thus we should allow this work queue to run during reclaim. Mark with G2H work queue with WQ_MEM_RECLAIM appropriately. Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021175705.1584521-4-matthew.brost@intel.com
-
Matthew Brost authored
GGTT work queue is used to free memory thus we should allow this work queue to run during reclaim. Mark with GGTT work queue with WQ_MEM_RECLAIM appropriately. Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021175705.1584521-3-matthew.brost@intel.com
-
Matthew Brost authored
Take ref to job's fence in arm rather than run job. This ref is owned by the drm scheduler so it makes sense to take the ref before handing over the job to the scheduler. Also removes an atomic from the run job path. Suggested-by:
Matthew Auld <matthew.auld@intel.com> Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021173512.1584248-1-matthew.brost@intel.com
-
Nirmoy Das authored
In case of parallel submissions multiple GuC id will point to the same exec queue and on GT reset such exec queues will get restarted multiple times which is not desirable. v2: don't use exec_queue_enabled() which could race, do the same for xe_guc_submit_stop (Matt B) Link: drm/xe/kernel#2295 Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022103555.731557-1-nirmoy.das@intel.com Signed-off-by:
Nirmoy Das <nirmoy.das@intel.com>
-
- Oct 22, 2024
-
-
mwajdecz authored
While we can already view individual VF LMEM provisioning using lmem_quota debugfs attribute, we want to have unified way to show summary across all VFs, like we do for GGTT or GuC doorbells/IDs. Signed-off-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by:
Marcin Bernatowicz <marcin.bernatowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241021201506.1771-1-michal.wajdeczko@intel.com
-
Zhanjun Dong authored
GuC based register capture is not supported by VF, thus prevent it running on VF. Signed-off-by:
Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022010116.342240-2-zhanjun.dong@intel.com Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
- Oct 21, 2024
-
-
The lite restore is a performance improvement feature which avoids unnecessary context switch (flush, save and restore) if the incoming context has a ContextID matching that of the outgoing context. The scheduling is done by the GuC firmware, so on the driver side it's just a matter of setting corresponding GUC_CTL_FEATURE flag. This is supposed to be enabled by default, thus the flag is set unconditionally. Signed-off-by:
Fei Yang <fei.yang@intel.com> Reviewed-by:
John Harrison <John.C.Harrison@Intel.com> Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017162710.942553-2-fei.yang@intel.com
-
- Oct 20, 2024
-
-
Matthew Brost authored
Good practice to use __counted_by in kernel coding for flexible arrays. Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241018030039.1077842-1-matthew.brost@intel.com
-
- Oct 18, 2024
-
-
Nirmoy Das authored
This shouldn't happen but seen this while debugging ufence timeout issue time to time so log it to isolate this particular case. v2: s/XE_WARN_ON/drm_dbg(Maarten) Link: drm/xe/kernel#1630 Cc: Francois Dugast <francois.dugast@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016082304.66009-3-nirmoy.das@intel.com Signed-off-by:
Nirmoy Das <nirmoy.das@intel.com>
-
Nirmoy Das authored
access_ok() only checks for addr overflow so also try to read the addr to catch invalid addr sent from userspace. Link: drm/xe/kernel#1630 Cc: Francois Dugast <francois.dugast@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016082304.66009-2-nirmoy.das@intel.com Signed-off-by:
Nirmoy Das <nirmoy.das@intel.com>
-
Shuicheng Lin authored
In some cases, when the driver attempts to read an MMIO register, the hardware may return 0xFFFFFFFF. The current force wake path code treats this as a valid response, as it only checks the BIT. However, 0xFFFFFFFF should be considered an invalid value, indicating a potential issue. To address this, we should add a log entry to highlight this condition and return failure. The force wake failure log level is changed from notice to err to match the failure return value. v2 (Matt Brost): - set ret value (-EIO) to kick the error to upper layers v3 (Rodrigo): - add commit message for the log level promotion from notice to err v4: - update reviewed info Suggested-by:
Alex Zuo <alex.zuo@intel.com> Signed-off-by:
Shuicheng Lin <shuicheng.lin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by:
Badal Nilawar <badal.nilawar@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017221547.1564029-1-shuicheng.lin@intel.com Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
- Oct 17, 2024
-
-
As part of this WA, GuC will hold a forcewake for certain MMIO accesses outside the GT/media domains. Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by:
Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by:
Matt Roper <matthew.d.roper@intel.com> Signed-off-by:
John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015234428.2004825-1-vinay.belgaumkar@intel.com
-
In case if g2h worker doesn't get opportunity to within specified timeout delay then flush the g2h worker explicitly. v2: - Describe change in the comment and add TODO (Matt B/John H) - Add xe_gt_warn on fence done after G2H flush (John H) v3: - Updated the comment with root cause - Clean up xe_gt_warn message (John H) Closes: drm/xe/kernel#1620 Closes: drm/xe/kernel#2902 Signed-off-by:
Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by:
Matthew Brost <matthew.brost@intel.com> Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241017111410.2553784-2-badal.nilawar@intel.com
-
Himal Ghimiray authored
There is no need to return an error from xe_force_wake_put(), as a failure implicitly indicates that the domain failed to sleep. v3 - Move kernel-doc to this patch (Badal) v5 - change parameter to unsigned int in xe_force_wake_put() v6 - Remove unneccsary wrapping (Michal) - Remove non required header (Michal) - Mention timeout(Michal) v8 - Fix kernel-doc Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by:
Badal Nilawar <badal.nilawar@intel.com> Signed-off-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by:
Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-27-himal.prasad.ghimiray@intel.com Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Himal Ghimiray authored
Add __must_check attribute for xe_force_wake_get(). Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Badal Nilawar <badal.nilawar@intel.com> Reviewed-by:
Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-26-himal.prasad.ghimiray@intel.com Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Himal Ghimiray authored
A failure in xe_force_wake_get() no longer increments the domain's refcount. Therefore, if xe_force_wake_get() fails during forcewake debugfs open, return an error. This ensures there are no valid file descriptors to close via forcewake debugfs, preventing refcount mismanagement. v3 - return xe_wakeref_t instead of int in xe_force_wake_get() v5 - return unsigned int from xe_force_wake_get() v6 - Use helper xe_force_wake_ref_has_domain() to determine the status of the call. Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by:
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by:
Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-25-himal.prasad.ghimiray@intel.com Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-