- May 22, 2023
-
-
Matthew Auld authored
drivers/gpu/drm/xe/xe_guc_submit_types.h:47: warning: cannot understand function prototype: 'struct guc_submit_parallel_scratch '
drivers/gpu/drm/xe/xe_devcoredump_types.h:38: warning: Function parameter or member 'ct' not described in 'xe_devcoredump_snapshot'
CI doesn't appear to be running BAT anymore, presumably because CI.Hooks is now failing due to the warnings above. Signed-off-by:
Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Nirmoy Das <nirmoy.das@intel.com>
-
Matthew Auld authored
In mutex_init() lockdep identifies a lock by defining a special static key for each lock class. However, if we wrap the macro in a function, like in drmm_mutex_init(), we end up generating:
    int drmm_mutex_init(struct drm_device *dev, struct mutex *lock)
    {
        static struct lock_class_key __key;
        __mutex_init((lock), "lock", &__key);
        ....
    }
The static __key here is what lockdep uses to identify the lock class. However, since this is just a normal function, the key is created only once and all callers share it. In effect the mutex->dep_map.key will be the same pointer for different drmm_mutex_init() callers. This results in impossible lockdep splats, since lockdep thinks completely unrelated locks belong to the same lock class. To fix this, turn drmm_mutex_init() into a macro so that it generates a different "static struct lock_class_key __key" for each invocation, which looks to be in line with what mutex_init() wants.
v2:
- Revamp the commit message with a clearer explanation of the issue.
- Export __drmm_mutex_release() rather than making it static inline.
Reported-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com> Reported-by:
Sarah Walker <sarah.walker@imgtec.com> Fixes: e13f13e0 ("drm: Add DRM-managed mutex_init()") Cc: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Jocelyn Falempe <jfalempe@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Signed-off-by:
Matthew Auld <matthew.auld@intel.com> Reviewed-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by:
Lucas De Marchi <lucas.demarchi@intel.com> Acked-by:
Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by:
Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230519090733.489019-1-matthew.auld@intel.com [mauld: Already merged upstream. To be dropped during xe-next rebase] (cherry picked from commit c21f11d1)
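The key-sharing problem described in this commit can be reproduced in plain userspace C. The sketch below is not the kernel code: struct fake_mutex, init_via_function() and init_via_macro() are hypothetical stand-ins, and only the address of the static key matters. It shows that a function wrapper hands every caller the same key, while a macro wrapper creates one key per expansion.

```c
/* Stand-in for the kernel's lock_class_key; only its address is used. */
struct lock_class_key { char dummy; };

/* Hypothetical mutex; 'key' plays the role of dep_map.key. */
struct fake_mutex {
    const struct lock_class_key *key;
};

/* Function wrapper: one static key exists, shared by every caller. */
void init_via_function(struct fake_mutex *m)
{
    static struct lock_class_key key;
    m->key = &key;
}

/* Macro wrapper: each expansion declares its own static key. */
#define init_via_macro(m) do {              \
    static struct lock_class_key key;       \
    (m)->key = &key;                        \
} while (0)

/* Returns 1 when two call sites of the function wrapper share a key. */
int function_callers_share_key(void)
{
    struct fake_mutex a, b;

    init_via_function(&a);
    init_via_function(&b);
    return a.key == b.key;
}

/* Returns 1 when two expansions of the macro wrapper share a key. */
int macro_callers_share_key(void)
{
    struct fake_mutex a, b;

    init_via_macro(&a);
    init_via_macro(&b);
    return a.key == b.key;
}
```

With the function wrapper both mutexes end up with the same key pointer, which is what makes lockdep treat unrelated locks as one class. With the macro the two pointers differ.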
-
- May 19, 2023
-
-
mwajdecz authored
Both GUC_HOST_INTERRUPT and MED_GUC_HOST_INTERRUPT can pass additional payload data to the GuC, but this capability is not used by the firmware yet. Stop using the value mandated by the legacy GuC interrupt register and use the default notify value (zero) instead. Bspec: 49813, 63363 Signed-off-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by:
Matt Roper <matthew.d.roper@intel.com>
-
mwajdecz authored
This GuC register can be moved together with the rest of the GuC register definitions and be named in a similar way. v2: fix placement Bspec: 63363 Signed-off-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> #v1 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Jouni Högander authored
Now we have a struct intel_frontbuffer pointer in intel_fb for Xe as well. Handle setting the frontbuffer bit the same way as in i915. Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Jouni Högander <jouni.hogander@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Jouni Högander authored
After reverting frontbuffer tracking removal our build is broken. Fix this by adding some includes and ifdefs. Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Jouni Högander <jouni.hogander@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Jouni Högander authored
This reverts commit a3844366. We want to keep frontbuffer tracking as removing it would break GPU rendering in i915 driver. Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Jouni Högander <jouni.hogander@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Jouni Högander authored
This reverts commit be61174d. We want to keep frontbuffer tracking as removing it would break GPU rendering in i915 driver. Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Jouni Högander <jouni.hogander@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Rodrigo Vivi authored
There are multiple kinds of config prints, and with the upcoming devcoredump there will be another layer. Let's limit the config to the top-level functions and leave the clean-up work to the compiler so we don't create a spider web of configs. No functional change. Just a preparation for the devcoredump. Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
Let's continue to add our existing simple logs to devcoredump one by one. Any format change should come in follow-up work. v2: remove unnecessary, and now duplicated, dma_fence annotation. (Matthew) v3: avoid for_each with faulty_engine since that can already be freed at the time of the read/free. Instead, iterate over the full array of hw_engines. (Kasan) Cc: Francois Dugast <francois.dugast@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Francois Dugast <francois.dugast@intel.com>
-
Rodrigo Vivi authored
The goal is to allow for a snapshot capture to be taken at the time of the crash, while the print out can happen at a later time through the exposed devcoredump virtual device. v2: Address Matthew's comments: - Handle memory allocation failures. - Do not use GFP_ATOMIC on cases like debugfs prints. - placement of @Reg doc. - indentation issues. v3: checkpatch v4: Rebase and get back to GFP_ATOMIC only. Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
Let's start to move our existing logs to devcoredump one by one. Any format change should come in follow-up work. Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
The goal is to allow for a snapshot capture to be taken at the time of the crash, while the print out can happen at a later time through the exposed devcoredump virtual device. v2: Handle memory allocation failures. (Matthew) Do not use GFP_ATOMIC on cases like debugfs prints. (Matthew) v3: checkpatch v4: pending_list allocation needs to be atomic because of the spin_lock. (Matthew) get back to GFP_ATOMIC only. (lockdep). Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
These structs and definitions are only used for the guc_submit and they were added specifically for the parallel submission. While doing that also delete the unused struct guc_wq_item. v2: checkpatch fixes. Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
Let's start to move our existing logs to devcoredump one by one. Any format change should come in follow-up work. v2: Rebase and add the dma_fence locking annotation here. Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
The goal is to allow for a snapshot capture to be taken at the time of the crash, while the print out can happen at a later time through the exposed devcoredump virtual device. v2: Handle memory allocation failures. (Matthew) Do not use GFP_ATOMIC on cases like debugfs prints. (Matthew) v3: checkpatch fixes v4: Do not use atomic in the g2h_worker_func (Matthew) Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
No functional change here. The goal is to have a clear split between the mapped portions of the CTB and the static information, so we can easily capture snapshots that will be used for later read out with the devcoredump infrastructure. Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
Unfortunately, the devcoredump infrastructure does not provide an interface for us to force the device removal at the pci_remove time of our device. The devcoredump is linked at the device level, so when in use it will prevent the module removal, but it doesn't prevent the call of the pci_remove callback. This callback cannot fail anyway, and we end up clearing and freeing the entire pci device. Hence, after we have removed the pci device, we shouldn't allow any read or free operations, to avoid a segmentation fault. Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Rodrigo Vivi authored
The goal is to use the devcoredump infrastructure to report error states captured at crash time. The error state will contain useful information for GPU hang debug, such as INSTDONE registers and the buffers currently being executed, as well as any other information that helps user space and allows later replays of the error.
The proposal here is to avoid an Xe-only error_state like i915's and use the standard dev_coredump infrastructure to expose the error state.
For our own case, the data is only useful if it is a snapshot of the time when the GPU crash happened, since we reset the GPU immediately afterwards and the registers might have changed. So the proposal is to keep an internal snapshot to be printed out later.
Also, a subsequent GPU hang is usually just a consequence of the initial one, so we only save the 'first' hang. The dev_coredump has a delayed work queue that removes the coredump and frees all the data within a few moments of the error. When that happens we also reset our capture state and allow further snapshots.
Right now this infra only prints out the time of the hang. More information will be migrated here in subsequent work. Also, in order to organize the dump better, the goal is to propose changes to dev_coredump itself to allow multiple files and different controls. But for now we start the Xe usage of it without any dependency on dev_coredump core changes.
v2: Add dma_fence annotation for capture that might happen during long running. (Thomas and Matt) Use xe->drm.primary->index on drm_info msg. (Jani)
v3: checkpatch fixes
v4: Fix building and locking issues found by Francois. Actually let's kill all of the locking in here. gt_reset serialization already guarantees that there will be only one capture at a time. Also, the devcoredump has its own locking to protect the free and reads, and drivers don't need to duplicate it. Besides this, the dma_fence locking was pushed to a following patch since it is not needed in this one. Fix a use after free identified by KASAN: Do not stash the faulty_engine since that will be freed somewhere else.
v5: Fix Uptime - ktime_get_boottime actually returns the Uptime. (Francois)
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
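The capture-now, print-later flow with a "first hang only" rule can be sketched in userspace C. Everything below is an assumption made for illustration: struct snapshot and the snapshot_*() helpers are hypothetical names, not the driver's API, and the real snapshot holds much more than a timestamp.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical snapshot; the real driver captures far more state. */
struct snapshot {
    bool captured;              /* true while a snapshot is held */
    unsigned long long hang_ns; /* time of the hang, saved at capture */
};

/* Capture at crash time. Returns false if an earlier hang is still held. */
bool snapshot_capture(struct snapshot *ss, unsigned long long now_ns)
{
    if (ss->captured)
        return false;           /* keep only the first hang */
    ss->hang_ns = now_ns;
    ss->captured = true;
    return true;
}

/* Print later, from the saved data only; live state is never read. */
int snapshot_print(const struct snapshot *ss, char *buf, size_t len)
{
    if (!ss->captured)
        return -1;
    return snprintf(buf, len, "hang time: %llu ns\n", ss->hang_ns);
}

/* Called when the coredump is removed; re-arms capture. */
void snapshot_free(struct snapshot *ss)
{
    ss->captured = false;
    ss->hang_ns = 0;
}
```

A second snapshot_capture() call is rejected until snapshot_free() runs, which mirrors the message's point that the capture state is reset only once the coredump data has been freed.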
-
mwajdecz authored
Replace generic log messages with ones dedicated to the GT. While at it, switch errno logs from plain %d to pretty %pe. v2: rebased v3: unify errno logs Signed-off-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matt Roper <matthew.d.roper@intel.com>
-
mwajdecz authored
While debugging GT-related problems, it's good to know which GT was reporting them. Introduce helper macros to prefix GT logs with the GT identifier. We will use them in upcoming patches. v2: use xe_ prefix (Lucas) v3: use correct include Signed-off-by:
Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Matt Roper <matthew.d.roper@intel.com>
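The idea of a GT-prefixing log helper can be illustrated with a small userspace macro. This is only a sketch under stated assumptions: struct gt and gt_fmt() are hypothetical names rather than the macros added by the patch, the "GT%d: " prefix format is invented for the example, and ##__VA_ARGS__ is a GNU extension (accepted by GCC and Clang, and common in kernel code).

```c
#include <stdio.h>

/* Hypothetical, minimal GT descriptor; only the id is needed here. */
struct gt {
    int id;
};

/*
 * Format a message into buf with a "GT<id>: " prefix. The prefix is
 * glued to the caller's format string at compile time, so fmt must be
 * a string literal.
 */
#define gt_fmt(buf, len, gt, fmt, ...) \
    snprintf((buf), (len), "GT%d: " fmt, (gt)->id, ##__VA_ARGS__)
```

Because the identifier comes from the gt argument, call sites do not have to format it by hand, and every message from the same helper carries a uniform prefix.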
-
Lucas De Marchi authored
Reviewed-by:
Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230518212446.3570168-4-lucas.demarchi@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Lucas De Marchi authored
Reviewed-by:
Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230518230221.3571188-1-lucas.demarchi@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
Lucas De Marchi authored
Reviewed-by:
Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230518212446.3570168-2-lucas.demarchi@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-
- May 18, 2023
-
-
José Roberto de Souza authored
The OpenGL stack marks scanout buffers as exported as well, because they are usually exported from the client application to the compositor. Exported buffers need to be placed in SMEM so they can be accessed by different GPUs if needed, so on discrete GPUs they are placed in SMEM+LMEM. But this combination prevents applications like kmscube from creating framebuffers. So do the same handling as i915 and allow BOs that can migrate to lmem0/vram0 to be promoted to framebuffers. At least all current discrete GPUs with display have only one LMEM region, so we don't need to check for vram1, but we might need to extend this in the future. Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
José Roberto de Souza <jose.souza@intel.com> Signed-off-by:
Matthew Auld <matthew.auld@intel.com>
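The placement rule described above, accepting any buffer that can migrate to the first VRAM region, can be sketched as a bitmask test. The flag names and can_scanout_discrete() are hypothetical and chosen for illustration; they are not the driver's real placement flags or helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical placement bits; the real driver uses its own flags. */
#define PLACE_SYSTEM (1u << 0)  /* SMEM */
#define PLACE_VRAM0  (1u << 1)  /* lmem0/vram0 */
#define PLACE_VRAM1  (1u << 2)  /* not checked, per the commit message */

/*
 * On a discrete GPU, accept a buffer for scanout as long as VRAM0 is
 * among its allowed placements, even if SMEM is allowed too.
 */
bool can_scanout_discrete(uint32_t placements)
{
    return (placements & PLACE_VRAM0) != 0;
}
```

Under this rule an exported SMEM+VRAM0 buffer passes, while a buffer limited to SMEM does not.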
-
- May 17, 2023
-
-
Matthew Brost authored
This is allowed and encouraged by the dma-fencing rules. This along with allowing compute VMs to export dma-fences on binds will result in a simpler compute UMD. Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Matthew Brost authored
Binds are not long running jobs thus we can export dma-fences even if a VM is in compute mode. Signed-off-by:
Matthew Brost <matthew.brost@intel.com> Reviewed-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com>
-
Jani Nikula authored
Use hotplug irq code from i915 display/intel_hotplug_irq.c instead of copy-paste. For now, need to add ilk_update_display_irq() and bdw_update_port_irq() to xe display/ext/i915_irq.c. Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Jani Nikula authored
Split hotplug irq handling out of i915_irq.[ch] into display/intel_hotplug_irq.[ch]. The line between the new intel_hotplug_irq.[ch] and the existing intel_hotplug.[ch] needs further clarification, but the first step is to move the stuff out of i915_irq.[ch]. Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by:
Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230515101738.2399816-2-jani.nikula@intel.com (cherry picked from commit da38ba98)
-
Jani Nikula authored
The return value is not used for anything. Reviewed-by:
Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230515101738.2399816-1-jani.nikula@intel.com (cherry picked from commit 08d8f430)
-
Jani Nikula authored
Move gmbus and dp aux irq handlers to their respective files. It should be up to them what to do with the irq, not the generic irq code. Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e825385fc03cb3d53c1f0b66712eea42dad69d59.1683219363.git.jani.nikula@intel.com (cherry picked from commit 685282a3)
-
Jani Nikula authored
The dsparb_lock may be coming back with [1] later, but adding it here is probably just a rebase leftover. Drop it. [1] https://patchwork.freedesktop.org/patch/msgid/20230327161223.406573-1-rodrigo.vivi@intel.com Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
- May 16, 2023
-
-
Gustavo Sousa authored
Move xe_register_pci_driver() and xe_unregister_pci_driver() to init_funcs to make sure that exit functions are also called when xe_register_pci_driver() fails. Note that this also allows adding init functions to be run after xe_register_pci_driver(). v2: - Move functions to init_funcs instead of having a special case for xe_register_pci_driver(). (Jani) Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Matt Atwood <matthew.s.atwood@intel.com> Signed-off-by:
Gustavo Sousa <gustavo.sousa@intel.com>
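The benefit of putting every step into one init table is that a failure at any step unwinds all earlier steps in reverse order. A minimal userspace sketch follows; struct init_funcs here, run_init() and the a_*/b_* demo steps are assumptions for illustration, with b_init() simulating a failing registration step.

```c
#include <stddef.h>

/* One entry per module init step, paired with its undo function. */
struct init_funcs {
    int (*init)(void);
    void (*exit)(void);
};

/* Records the call order for the demo steps below. */
char call_log[16];
size_t call_log_len;

static void record(char c)
{
    if (call_log_len < sizeof(call_log) - 1) {
        call_log[call_log_len++] = c;
        call_log[call_log_len] = '\0';
    }
}

/* Hypothetical steps: 'a' succeeds, 'b' simulates a failing step. */
int a_init(void) { record('A'); return 0; }
void a_exit(void) { record('a'); }
int b_init(void) { record('B'); return -1; }
void b_exit(void) { record('b'); }

const struct init_funcs demo_funcs[] = {
    { a_init, a_exit },
    { b_init, b_exit },
};

/* Run each init in order; on failure, undo completed steps in reverse. */
int run_init(const struct init_funcs *funcs, size_t n)
{
    size_t i;
    int err;

    for (i = 0; i < n; i++) {
        err = funcs[i].init();
        if (err) {
            while (i--) {
                if (funcs[i].exit)
                    funcs[i].exit();
            }
            return err;
        }
    }
    return 0;
}
```

Here the failing step's own exit function is not called; only steps that completed are undone. Appending further entries after the registration step needs no special casing, which is the flexibility the commit message mentions.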
-
Gustavo Sousa authored
There is not much of a benefit from using that macro as of now and it hurts grepability or other ways of cross-referencing. Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Matt Atwood <matthew.s.atwood@intel.com> Signed-off-by:
Gustavo Sousa <gustavo.sousa@intel.com>
-
Thomas Hellström authored
drm/xe: Properly remove the vma from the vm::notifier::rebind_list when destroyed
If a vma was destroyed with the bo evicted, we might forget to remove it from the notifier::rebind_list. Make sure that really happens. v2: - Remove an unnecessarily verbose comment about how to avoid taking a lock. (Matthew Brost) Cc: Oded Gabbay <ogabbay@kernel.org> Signed-off-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
Thomas Hellström authored
drm/xe: Fix unlocked access of the vma::rebind_link
The vma::rebind_link is protected by the vm's resv, but we were modifying it without holding it. Fix this by using the vma::userptr_link instead for the tmp_evict list. The vma::userptr_link is protected by the vm lock. Cc: Oded Gabbay <ogabbay@kernel.org> Signed-off-by:
Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by:
Matthew Brost <matthew.brost@intel.com>
-
- May 15, 2023
-
-
Jani Nikula authored
CHV_FUSE_GT (0x182168) is purely about GT fuses, therefore belongs in intel_gt_regs.h, is in the gcfgmmio unit, but is technically in the VLV display base area. Add VLV_GUNIT_BASE to drop dependency on VLV_DISPLAY_BASE and thus display/intel_display_reg_defs.h in intel_gt_regs.h. v2: Add VLV_GUNIT_BASE (Ville) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230511152153.986676-1-jani.nikula@intel.com (cherry picked from commit 6e4e9fbd)
-
Jani Nikula authored
The dependency on display/intel_display_reg_defs.h is removed by upstream commit 6e4e9fbd ("drm/i915/gt: drop dependency on VLV_DISPLAY_BASE"). Signed-off-by:
Jani Nikula <jani.nikula@intel.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com>
-
Signed-off-by:
Francois Dugast <francois.dugast@intel.com> Reviewed-by:
Matt Atwood <matthew.s.atwood@intel.com>
-
- May 13, 2023
-
-
Lucas De Marchi authored
Fix race when pushing the display annotation on desc struct and enabling ADL-N. Link: https://lore.kernel.org/r/20230513050830.3240970-1-lucas.demarchi@intel.com Signed-off-by:
Lucas De Marchi <lucas.demarchi@intel.com>
-