- Jul 26, 2022
-
-
In vma destruction, the following race may occur:

Thread 1:                     Thread 2:
i915_vma_destroy();
...
list_del_init(vma->vm_link);
...
mutex_unlock(vma->vm->mutex);
                              __i915_vm_release();
release_references();

And in release_references() we dereference vma->vm to get to the vm's gt pointer, leading to a use-after-free. However, __i915_vm_release() grabs vm->mutex, so the vm won't be destroyed before vma->vm->mutex is released; therefore, extract the gt pointer under vm->mutex to avoid the vma->vm dereference in release_references().

v2: Fix a typo in the commit message (Andi Shyti)

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944
Fixes: e1a7ab4f ("drm/i915: Remove the vm open count")
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
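A minimal sketch of the fix, assuming the helper names force_unbind() and release_references() from the destruction path (illustrative, not the verbatim diff):

    void i915_vma_destroy(struct i915_vma *vma)
    {
            struct intel_gt *gt;

            mutex_lock(&vma->vm->mutex);
            force_unbind(vma);
            list_del_init(&vma->vm_link);
            /* Extract the gt pointer while vm->mutex still pins the vm. */
            gt = vma->vm->gt;
            mutex_unlock(&vma->vm->mutex);

            /* vma->vm may be freed from here on; only use the cached gt. */
            release_references(vma, gt);
    }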
-
If the move or clear operation somehow fails, and the memory underneath is not cleared, like when moving to lmem, then we currently fall back to memcpy or memset. However, with small-BAR systems this fallback might no longer be possible. For now we use the set_wedged sledgehammer if we ever encounter such a scenario, and mark the object as borked to plug any holes where access to the memory underneath can happen. Add some basic selftests to exercise this.

v2:
  - In the selftests make sure we grab the runtime pm around the reset. Also make sure we grab the reset lock before checking if the device is wedged, since the wedge might still be in progress and hence the bit might not be set yet.
  - Don't wedge or put the object into an unknown state if the request construction fails (or similar). Just returning an error and skipping the fallback should be safe here.
  - Make sure we wedge each gt. (Thomas)
  - Peek at the unknown_state in io_reserve; that way we don't have to export or hand-roll the fault_wait_for_idle. (Thomas)
  - Add the missing read-side barriers for the unknown_state. (Thomas)
  - Some kernel-doc fixes. (Thomas)
v3:
  - Tweak the ordering of the set_wedged, also add FIXME.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jordan Justen <jordan.l.justen@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629174350.384910-11-matthew.auld@intel.com
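A minimal sketch of the failure handling described above; the unknown_state field and the helper's shape are assumptions based on the message, while for_each_gt()/intel_gt_set_wedged() are existing i915 helpers:

    static void fallback_failed(struct drm_i915_private *i915,
                                struct drm_i915_gem_object *obj)
    {
            struct intel_gt *gt;
            unsigned int id;

            /* FIXME: set_wedged is a sledgehammer; refine this later. */
            for_each_gt(gt, i915, id)
                    intel_gt_set_wedged(gt);

            /* Contents are undefined: fail CPU/GPU access from now on. */
            obj->mm.unknown_state = true;
            smp_wmb(); /* pairs with the read-side barriers at lookup */
    }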
-
- Jun 22, 2022
-
-
In the future display might try to call this with a normal smem object, which doesn't require PIN_MAPPABLE underneath in order to CPU-map the pages (unlike stolen). Extend i915_vma_pin_iomap() to directly use i915_gem_object_pin_map() for such cases. This change was suggested by Chris P Wilson, that we pin the smem with i915_gem_object_pin_map_unlocked().

v2 (jheikkil): Change i915_gem_object_pin_map_unlocked to i915_gem_object_pin_map

Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Jari Tahvanainen <jari.tahvanainen@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com> [mauld: tweak commit message, plus minor checkpatch fix]
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-2-juhapekka.heikkila@gmail.com
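A minimal sketch of the extended path, assuming the smem branch simply reuses i915_gem_object_pin_map() (I915_MAP_WC chosen here for illustration only):

    void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
    {
            if (!i915_vma_is_map_and_fenceable(vma)) {
                    /* Normal smem object: no GGTT aperture needed, so
                     * CPU-map the pages directly. */
                    void *ptr = i915_gem_object_pin_map(vma->obj, I915_MAP_WC);

                    if (IS_ERR(ptr))
                            return IOMEM_ERR_PTR(PTR_ERR(ptr));
                    return (void __iomem *)ptr;
            }

            /* ... existing io_mapping path for stolen/GGTT ... */
    }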
-
Don't leak the lmem mapping in vma_evict: move __i915_vma_iounmap() outside the i915_vma_is_map_and_fenceable() check.

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220610121205.29645-3-juhapekka.heikkila@gmail.com
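The shape of the fix, sketched (eviction details elided):

    static void __i915_vma_evict(struct i915_vma *vma)
    {
            if (i915_vma_is_map_and_fenceable(vma)) {
                    /* fence revocation and other GGTT-only teardown */
            }

            /* Moved out of the branch above: lmem vmas are not
             * map-and-fenceable, yet may hold an iomap too. */
            __i915_vma_iounmap(vma);
    }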
-
- Jun 13, 2022
-
-
_i915_vma_move_to_active() can receive more than one fence for a multiple-batch-buffer submission. Because dma_resv_add_fence() can only accept one fence at a time, change _i915_vma_move_to_active() to be aware of multiple fences so that it can add individual fences to the dma_resv object.

v6: fix multi-line comment.
v5: remove double fence reservation for batch VMAs.
v4: Reserve fences for composite_fence on multi-batch contexts and also reserve fence slots to composite_fence for each VMA.
v3: dma_resv_reserve_fences is not cumulative so pass num_fences.
v2: make sure to reserve enough fence slots before adding.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5614
Fixes: 544460c3 ("drm/i915: Multi-BB execbuf")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220525095955.15371-1-nirmoy.das@intel.com
(cherry picked from commit 420a07b8)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
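A minimal sketch of the reserve-then-add flow, with the signature simplified; DMA_RESV_USAGE_WRITE is illustrative (the real usage depends on the access flags):

    static int move_to_active(struct i915_vma *vma,
                              struct dma_fence **fences,
                              unsigned int num_fences)
    {
            struct dma_resv *resv = vma->obj->base.resv;
            unsigned int i;
            int err;

            /* dma_resv_reserve_fences() is not cumulative, so pass the
             * total number of slots needed up front. */
            err = dma_resv_reserve_fences(resv, num_fences);
            if (err)
                    return err;

            /* dma_resv_add_fence() accepts one fence at a time. */
            for (i = 0; i < num_fences; i++)
                    dma_resv_add_fence(resv, fences[i],
                                       DMA_RESV_USAGE_WRITE);

            return 0;
    }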
-
- May 27, 2022
-
-
_i915_vma_move_to_active() can receive more than one fence for a multiple-batch-buffer submission. Because dma_resv_add_fence() can only accept one fence at a time, change _i915_vma_move_to_active() to be aware of multiple fences so that it can add individual fences to the dma_resv object.

v6: fix multi-line comment.
v5: remove double fence reservation for batch VMAs.
v4: Reserve fences for composite_fence on multi-batch contexts and also reserve fence slots to composite_fence for each VMA.
v3: dma_resv_reserve_fences is not cumulative so pass num_fences.
v2: make sure to reserve enough fence slots before adding.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5614
Fixes: 544460c3 ("drm/i915: Multi-BB execbuf")
Cc: <stable@vger.kernel.org> # v5.16+
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220525095955.15371-1-nirmoy.das@intel.com
-
- May 09, 2022
-
-
i915_vma_reopen() checked whether the vma was closed before, without taking the lock, so multiple threads could attempt to remove the vma. Instead, the lock needs to be taken before actually checking.

v2: move struct declaration

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.3+
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5732
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Fixes: 155ab883 ("drm/i915: Move object close under its own lock")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220420095720.3331609-1-kherbst@redhat.com
(cherry picked from commit 1df1c79c)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
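A sketch of the corrected ordering, assuming gt->closed_lock guards the closed list and __i915_vma_remove_closed() does the unlinking (names assumed from context):

    void i915_vma_reopen(struct i915_vma *vma)
    {
            struct intel_gt *gt = vma->vm->gt;

            spin_lock_irq(&gt->closed_lock);
            /* Check only while holding the lock, so two threads cannot
             * both observe "closed" and both try to remove the vma. */
            if (i915_vma_is_closed(vma))
                    __i915_vma_remove_closed(vma);
            spin_unlock_irq(&gt->closed_lock);
    }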
-
- May 04, 2022
-
-
Karol Herbst authored
i915_vma_reopen() checked whether the vma was closed before, without taking the lock, so multiple threads could attempt to remove the vma. Instead, the lock needs to be taken before actually checking.

v2: move struct declaration

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.3+
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5732
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Fixes: 155ab883 ("drm/i915: Move object close under its own lock")
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220420095720.3331609-1-kherbst@redhat.com
-
Kefeng Wang authored
Use IOMEM_ERR_PTR() instead of the self-defined IO_ERR_PTR().

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220503144937.679424-1-tvrtko.ursulin@linux.intel.com
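An illustration of the change; IOMEM_ERR_PTR() comes from <linux/io.h>:

    /* Before: a driver-local duplicate of an existing kernel macro. */
    #define IO_ERR_PTR(x) ((void __iomem *)ERR_PTR(x))

    /* After: just use the generic helper. */
    return IOMEM_ERR_PTR(err);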
-
- May 03, 2022
-
-
Tvrtko Ursulin authored
Use lockdep_assert_not_held() to simplify and correct the code. Otherwise false positives are hit if the lock state is unknown, like after a previous taint.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220429140757.651406-1-tvrtko.ursulin@linux.intel.com
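An illustration of why the helper is more correct (the lock here is a stand-in): lockdep_assert_not_held() treats "state unknown" (e.g. after a previous taint disabled lockdep) as a pass, while the open-coded form cannot make that distinction:

    /* Before: lockdep_is_held() returns a truthy "unknown" value once
     * lockdep is tainted, so this may fire a false positive. */
    GEM_BUG_ON(lockdep_is_held(&vm->mutex));

    /* After: silent when the lock state is unknown. */
    lockdep_assert_not_held(&vm->mutex);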
-
- Apr 07, 2022
-
-
Christian König authored
That should now be handled by the common dma_resv framework.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: intel-gfx@lists.freedesktop.org
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-13-christian.koenig@amd.com
-
Christian König authored
Instead of distinguishing between shared and exclusive fences, specify the fence usage while adding fences. Rework all drivers to use this interface instead and deprecate the old one.

v2: some kerneldoc comments suggested by Daniel
v3: fix a missing case in radeon
v4: rebase on nouveau changes, fix lockdep and temporarily disable warning
v5: more documentation updates
v6: separate internal dma_resv changes from this patch, avoids disabling the warning temporarily, rebase on upstream changes
v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com
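The resulting interface, illustrated (resv/fence are placeholders; the usage values are the ones introduced by the rework):

    /* Old: the semantics are implied by which function is called. */
    dma_resv_add_excl_fence(resv, fence);
    dma_resv_add_shared_fence(resv, fence);

    /* New: one entry point with an explicit usage. */
    dma_resv_add_fence(resv, fence, DMA_RESV_USAGE_WRITE); /* was exclusive */
    dma_resv_add_fence(resv, fence, DMA_RESV_USAGE_READ);  /* was shared */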
-
- Apr 06, 2022
-
-
Christian König authored
Audit all the users of dma_resv_add_excl_fence() and make sure they reserve a shared slot also when only trying to add an exclusive fence. This is the next step towards handling the exclusive fence like a shared one.

v2: fix missed case in amdgpu
v3: and two more in radeon, rename function
v4: add one more case to TTM, fix i915 after rebase

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com
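The audited pattern, as a sketch (obj and fence are placeholders):

    /* Reserve a slot even when only an exclusive fence will be added;
     * this is the prerequisite for treating all fences uniformly. */
    ret = dma_resv_reserve_fences(obj->resv, 1);
    if (ret)
            return ret;

    dma_resv_add_excl_fence(obj->resv, fence);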
-
- Mar 07, 2022
-
-
Matthew Auld authored
This is no longer possible since e6e1a304 ("drm/i915: vma is always backed by an object.").

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304174252.1000238-1-matthew.auld@intel.com
-
Matthew Auld authored
If the vm doesn't request async binding, like for example with the dpt, then we should be able to skip the async path and avoid calling i915_vm_lock_objects() altogether. Currently, if we have a moving fence set for the BO (even though it might have signalled), we still take the async path regardless of the bind_async setting, and then later still end up just doing i915_gem_object_wait_moving_fence() anyway. Alternatively we would need to add a dummy scratch object which can be locked, just for the dpt.

Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304095934.925036-2-matthew.auld@intel.com
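A minimal sketch of the short-circuit, assuming the vm advertises async binding via its bind_async_flags (usage here is purely illustrative):

    if (flags & PIN_USER && vma->vm->bind_async_flags) {
            /* Async path: i915_vm_lock_objects(), fence work, etc. */
    } else {
            /* Sync path: no need to wait on a moving fence that may
             * already have signalled; bind directly under the mutex. */
    }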
-
Thomas Hellström authored
Now that i915_vma_parked() takes the object lock on vma destruction, and the only user of the vma refcount, i915_gem_object_unbind(), also takes the object lock, remove the vma refcount.

v3: Documentation update.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-3-thomas.hellstrom@linux.intel.com
-
Thomas Hellström authored
vms are not getting properly closed. Rather than fixing that, remove the vm open count and instead rely on the vm refcount. The vm open count existed solely to break the strong references the vmas had on the vms. Now instead make those references weak and ensure vmas are destroyed when the vm is destroyed.

Unfortunately, if the vm destructor and the object destructor both want to destroy a vma, that may lead to a race in that the vm destructor just unbinds the vma and leaves the actual vma destruction to the object destructor. However, in order for the object destructor to ensure the vma is unbound it needs to grab the vm mutex. In order to keep the vm mutex alive until the object destructor is done with it, somewhat hackishly grab a vm_resv refcount that is released late in the vma destruction process, when the vm mutex is no longer needed.

v2: Address review comments from Niranjana
  - Clarify that the struct i915_address_space::skip_pte_rewrite is a hack and should ideally be replaced in an upcoming patch.
  - Remove an unneeded continue in clear_vm_list and update comment.
v3:
  - Documentation update
  - Commit message formatting

Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com
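A sketch of the mutex-lifetime trick described above, assuming i915_vm_resv_get()/i915_vm_resv_put() hold the vm's reservation object (and hence the memory backing vm->mutex) alive independently of the vm refcount:

    struct i915_address_space *vm = vma->vm;

    i915_vm_resv_get(vm);   /* pin the memory backing vm->mutex */

    mutex_lock(&vm->mutex);
    /* unbind and drop the vma's weak references to the vm */
    mutex_unlock(&vm->mutex);

    i915_vm_resv_put(vm);   /* the vm memory may be freed after this */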
-
- Feb 28, 2022
-
-
Thomas Hellström authored
It's unclear what reference the initial vma kref reference refers to. A vma can have multiple weak references, the object vma list, the vm's bound list and the GT's closed_list, and the initial vma reference can be put from lookups of all these lists.

With the current implementation this means that any holder of yet another vma refcount (currently only i915_gem_object_unbind()) needs to be holding two of either
*) An object refcount,
*) A vm open count
*) A vma open count
in order for us to not risk leaking a reference by having the initial vma reference being put twice.

Address this by re-introducing i915_vma_destroy() which removes all weak references of the vma and *then* puts the initial vma refcount. This makes a strong vma reference hold on to the vma unconditionally.

Perhaps a better name would be i915_vma_revoke() or i915_vma_zombify(), since other callers may still hold a refcount, but with the prospect of being able to replace the vma refcount with the object lock in the near future, let's stick with i915_vma_destroy().

Finally this commit fixes a race in that previously i915_vma_release() and now i915_vma_destroy() could destroy a vma without taking the vm->mutex after an advisory check that the vma mm_node was not allocated. This would race with the ungrab_vma() function creating a trace similar to the below one. This was fixed in one of the __i915_vma_put() callsites in commit bc1922e5 ("drm/i915: Fix a race between vma / object destruction and unbinding") but although not seemingly triggered by CI, that is not sufficient. This patch is needed to fix that properly.

[823.012188] Console: switching to colour dummy device 80x25
[823.012422] [IGT] gem_ppgtt: executing
[823.016667] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0
[852.436465] stack segment: 0000 [#1] PREEMPT SMP NOPTI
[852.436480] CPU: 0 PID: 3200 Comm: gem_ppgtt Not tainted 5.16.0-CI-CI_DRM_11115+ #1
[852.436489] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.2422.A00.2110131104 10/13/2021
[852.436499] RIP: 0010:ungrab_vma+0x9/0x80 [i915]
[852.436711] Code: ef e8 4b 85 cf e0 e8 36 a3 d6 e0 8b 83 f8 9c 00 00 85 c0 75 e1 5b 5d 41 5c 41 5d c3 e9 d6 fd 14 00 55 53 48 8b af c0 00 00 00 <8b> 45 00 85 c0 75 03 5b 5d c3 48 8b 85 a0 02 00 00 48 89 fb 48 8b
[852.436727] RSP: 0018:ffffc90006db7880 EFLAGS: 00010246
[852.436734] RAX: 0000000000000000 RBX: ffffc90006db7598 RCX: 0000000000000000
[852.436742] RDX: ffff88815349e898 RSI: ffff88815349e858 RDI: ffff88810a284140
[852.436748] RBP: 6b6b6b6b6b6b6b6b R08: ffff88815349e898 R09: ffff88815349e8e8
[852.436754] R10: 0000000000000001 R11: 0000000051ef1141 R12: ffff88810a284140
[852.436762] R13: 0000000000000000 R14: ffff88815349e868 R15: ffff88810a284458
[852.436770] FS: 00007f5c04b04e40(0000) GS:ffff88849f000000(0000) knlGS:0000000000000000
[852.436781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[852.436788] CR2: 00007f5c04b38fe0 CR3: 000000010a6e8001 CR4: 0000000000770ef0
[852.436797] PKRU: 55555554
[852.436801] Call Trace:
[852.436806] <TASK>
[852.436811] i915_gem_evict_for_node+0x33c/0x3c0 [i915]
[852.437014] i915_gem_gtt_reserve+0x106/0x130 [i915]
[852.437211] i915_vma_pin_ww+0x8f4/0xb60 [i915]
[852.437412] eb_validate_vmas+0x688/0x860 [i915]
[852.437596] i915_gem_do_execbuffer+0xc0e/0x25b0 [i915]
[852.437770] ? deactivate_slab+0x5f2/0x7d0
[852.437778] ? _raw_spin_unlock_irqrestore+0x50/0x60
[852.437789] ? i915_gem_execbuffer2_ioctl+0xc6/0x2c0 [i915]
[852.437944] ? init_object+0x49/0x80
[852.437950] ? __lock_acquire+0x5e6/0x2580
[852.437963] i915_gem_execbuffer2_ioctl+0x116/0x2c0 [i915]
[852.438129] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438300] drm_ioctl_kernel+0xac/0x140
[852.438310] drm_ioctl+0x201/0x3d0
[852.438316] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438490] __x64_sys_ioctl+0x6a/0xa0
[852.438498] do_syscall_64+0x37/0xb0
[852.438507] entry_SYSCALL_64_after_hwframe+0x44/0xae
[852.438515] RIP: 0033:0x7f5c0415b317
[852.438523] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[852.438542] RSP: 002b:00007ffd765039a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[852.438553] RAX: ffffffffffffffda RBX: 000055e4d7829dd0 RCX: 00007f5c0415b317
[852.438562] RDX: 00007ffd76503a00 RSI: 00000000c0406469 RDI: 0000000000000017
[852.438571] RBP: 00007ffd76503a00 R08: 0000000000000000 R09: 0000000000000081
[852.438579] R10: 00000000ffffff7f R11: 0000000000000246 R12: 00000000c0406469
[852.438587] R13: 0000000000000017 R14: 00007ffd76503a00 R15: 0000000000000000
[852.438598] </TASK>
[852.438602] Modules linked in: snd_hda_codec_hdmi i915 mei_hdcp x86_pkg_temp_thermal snd_hda_intel snd_intel_dspcfg drm_buddy coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ttm ghash_clmulni_intel snd_hwdep snd_hda_core e1000e drm_dp_helper ptp snd_pcm mei_me drm_kms_helper pps_core mei syscopyarea sysfillrect sysimgblt fb_sys_fops prime_numbers intel_lpss_pci smsc75xx usbnet mii
[852.440310] ---[ end trace e52cdd2fe4fd911c ]---

v2: Fix typos in the commit message.

Fixes: 7e00897b ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.")
Fixes: bc1922e5 ("drm/i915: Fix a race between vma / object destruction and unbinding")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222133209.587978-1-thomas.hellstrom@linux.intel.com
-
Matthew Auld authored
If the user doesn't require CPU access for the buffer, then ALLOC_GPU_ONLY should be used, in order to prioritise allocating in the non-mappable portion of LMEM, on devices with small BAR.

v2 (Thomas):
  - The BO_ALLOC_TOPDOWN naming here is poor, since this is pure lies on systems that don't even have small BAR. A better name is GPU_ONLY, which is accurate regardless of the configuration.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-3-matthew.auld@intel.com
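A hedged example of a caller opting in (the size and call site are placeholders):

    /* No CPU access required: let the allocator prefer the
     * non-mappable portion of LMEM on small-BAR systems. */
    obj = i915_gem_object_create_lmem(i915, size,
                                      I915_BO_ALLOC_GPU_ONLY);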
-
- Feb 20, 2022
-
-
For local-memory objects we need to align the GTT addresses to 64K, both for the ppgtt and ggtt. We need to support vm->min_alignment > 4K, depending on the vm itself and the type of object we are inserting. With this in mind, update the GTT selftests to take this into account. For compact-pt we further align and pad lmem object GTT addresses to 2MB to ensure PDEs contain consistent page sizes as required by the HW.

v3:
  - use needs_compact_pt flag to discriminate between 64K and 64K with compact-pt
  - add i915_vm_obj_min_alignment
  - use i915_vm_obj_min_alignment to round up vma reservation if compact-pt, instead of hard coding
v5:
  - fix i915_vm_obj_min_alignment for internal objects which have no memory region
v6:
  - tiled_blits_create correctly picks the largest required alignment
v8:
  - i915_vm_min_alignment protects against array overflow for mock region

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-7-ramalingam.c@intel.com
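A sketch of how a selftest might consume the new helper (the exact signature is assumed from the description):

    /* 64K for lmem, or 2MB with compact-pt, so PDEs never mix
     * page sizes; 4K for everything else. */
    u64 align = i915_vm_obj_min_alignment(vm, obj);

    addr = round_up(addr, align);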
-
- Jan 28, 2022
-
-
Maarten Lankhorst authored
i915_gem_vm_close() may take the lock, and we currently have no better way of handling this. At least for now, allow a path in which holding vm->mutex is sufficient. This is the case because the object destroy path will forcefully take vm->mutex now.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220128085739.1464568-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
-
- Jan 27, 2022
-
-
Dan Carpenter authored
This "ret" declaration shadows an existing "ret" variable at the top of the function. Delete it. Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by:
Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127085115.GD25644@kili Fixes: f6c466b8 ("drm/i915: Add support for moving fence waiting")
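The bug class, illustrated generically (not the actual function):

    int ret = 0;

    if (condition) {
            int ret;                /* shadows the outer ret */

            ret = do_something();   /* this value dies at block exit */
    }

    return ret;                     /* always 0: errors are dropped */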
-
- Jan 26, 2022
-
-
Thomas Hellström authored
In some places we use a leftover kfree() instead of i915_vma_resource_free(). Fix this.

Fixes: 2f6b90da ("drm/i915: Use vma resources for async unbinding")
Reported-by: Robert Beckett <bob.beckett@collabora.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220119174734.213552-1-thomas.hellstrom@linux.intel.com
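The fix pattern: vma resources come from their own kmem_cache, so a generic kfree() is the wrong deallocator:

    -       kfree(vma_res);
    +       i915_vma_resource_free(vma_res);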
-
- Jan 25, 2022
-
-
Tvrtko Ursulin authored
We need to flush TLBs before releasing backing store, otherwise userspace is able to encounter stale entries if a) it is not declaring access to certain buffers and b) it races with the backing store release from such undeclared execution already running on the GPU in parallel.

The approach taken is to mark any buffer objects which were ever bound to the GPU and to trigger a serialized TLB flush when their backing store is released.

Alternatively the flushing could be done on VMA unbind, at which point we would be able to ascertain whether there is potential parallel GPU execution (which could race), but essentially it boils down to paying the cost of TLB flushes potentially needlessly at VMA unbind time (when the backing store is not known to be going away, so not needed for safety), versus potentially needlessly at backing store release time (since at that point we cannot tell whether there is anything executing on the GPU which uses that object). Therefore simplicity of implementation has been chosen for now, with scope to benchmark and refine later as required.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
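A minimal sketch of the scheme, assuming a per-object "was ever bound" bit and a serialized GT-wide invalidation helper (names follow the description but are assumptions here):

    /* At bind time: remember the object has been visible to the GPU. */
    set_bit(I915_BO_WAS_BOUND_BIT, &obj->flags);

    /* At backing store release: flush stale TLB entries, serialized
     * so concurrent releases don't issue redundant flushes. */
    if (test_bit(I915_BO_WAS_BOUND_BIT, &obj->flags))
            intel_gt_invalidate_tlbs(gt);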
-
- Jan 18, 2022
-
-
Maarten Lankhorst authored
Add a flag PIN_VALIDATE to indicate we don't need to pin and are only protected by the object lock. This removes the need to unpin, which is done by just releasing the lock.

eb_reserve is slightly reworked for readability, but the same steps are still done (see the sketch after this entry):
- First pass pins with NONBLOCK.
- Second pass unbinds all objects first, then pins.
- Third pass is only called when not all objects are softpinned, and unbinds all objects, then calls i915_gem_evict_vm(), then pins.

Changes since v1:
- Split out eb_reserve() into separate functions for readability.
Changes since v2:
- Make the batch buffer mappable on platforms where only GGTT is available, to prevent moving the batch buffer during relocations.
Changes since v3:
- Preserve current behavior for the batch buffer, instead be cautious when calling i915_gem_object_ggtt_pin_ww, and re-use the current batch vma if it's inside ggtt and map-and-fenceable.
- Remove impossible condition check from eb_reserve. (Matt)
Changes since v5:
- Do not even temporarily pin, just call i915_gem_evict_vm() and mark all vmas as unpinned.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-7-maarten.lankhorst@linux.intel.com
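The three passes, sketched as pseudocode; eb_pin_all()/eb_unbind_all() are hypothetical stand-ins for the split-out helpers:

    /* Pass 1: optimistic, don't wait for space. */
    err = eb_pin_all(eb, PIN_NONBLOCK);
    if (err != -ENOSPC)
            return err;

    /* Pass 2: unbind everything first, then pin again. */
    eb_unbind_all(eb);
    err = eb_pin_all(eb, 0);
    if (err != -ENOSPC)
            return err;

    /* Pass 3 (only when not everything is softpinned): evict the vm.
     * No temporary pinning; vmas are simply marked unpinned. */
    eb_unbind_all(eb);
    err = i915_gem_evict_vm(eb->context->vm, &eb->ww);
    if (err)
            return err;
    return eb_pin_all(eb, 0);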
-
Maarten Lankhorst authored
Now that we require the object lock for all ops, some code handling race conditions can be removed. This is required to not take short-term pins inside execbuf.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-6-maarten.lankhorst@linux.intel.com
-
Maarten Lankhorst authored
We want to remove more members of i915_vma, which requires the locking to be held more often. Start requiring the gem object lock for i915_vma_unbind(), as it's one of the callers that may unpin pages.

Some special care is needed when evicting, because the last reference to the object may be held by the VMA, so after __i915_vma_unbind(), the vma may be garbage and we need to cache vma->obj before unlocking (see the sketch after this entry).

Changes since v1:
- Make trylock failing a WARN. (Matt)
- Remove double i915_vma_wait_for_bind() (Matt)
- Move atomic_set to right before mutex_unlock(), to make it more clear they belong together. (Matt)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-5-maarten.lankhorst@linux.intel.com
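The caching pattern described in the message, as a sketch:

    /* The vma may be destroyed by the unbind, so cache vma->obj up
     * front in order to drop the object lock afterwards. */
    struct drm_i915_gem_object *obj = vma->obj;

    err = __i915_vma_unbind(vma);
    /* vma may be garbage from here on. */
    i915_gem_object_unlock(obj);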
-
Maarten Lankhorst authored
Because we will start to require the obj->resv lock for unbinding, ensure these vma eviction utility functions also take the lock. This requires some function signature changes to ensure that the ww context is passed around, but is mostly straightforward.

Previously this was split up into several patches, but reworking should allow for easier bisection.

Changes since v1:
- Handle evicting dead objects better.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-4-maarten.lankhorst@linux.intel.com
-
Maarten Lankhorst authored
i915_gem_evict_vm() will need to be able to evict objects that are locked by the current ctx. By testing whether the current context already locked the object, we can do this correctly. This allows us to evict the entire vm even if we already hold some objects' locks.

Previously this was spread over several commits, but it makes more sense to commit the changes to i915_gem_evict_vm separately from the changes to i915_gem_evict_something() and i915_gem_evict_for_node().

Changes since v1:
- Handle evicting dead objects better.
Changes since v2:
- Use for_i915_gem_ww in igt_evict_vm. (Thomas)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> [mlankhorst: Fix up doc warning.]
Link: https://patchwork.freedesktop.org/patch/msgid/20220117075604.131477-1-maarten.lankhorst@linux.intel.com
-
- Jan 11, 2022
-
-
Thomas Hellström authored
There is always a struct vma_resource guaranteed to be alive when we access a corresponding struct vma_snapshot. So ditch the latter and, instead of allocating vma_snapshots, reference the already existing vma_resource. This requires a couple of extra members in struct vma_resource, but that's a small price to pay for the simplification.

v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-7-thomas.hellstrom@linux.intel.com
-
Thomas Hellström authored
Implement async (non-blocking) unbinding by not syncing the vma before calling unbind on the vma_resource. Add the resulting unbind fence to the object's dma_resv from where it is picked up by the ttm migration code. Ideally these unbind fences should be coalesced with the migration blit fence to avoid stalling the migration blit waiting for unbind, as they can certainly go on in parallel, but since we don't yet have a reasonable data structure to use to coalesce fences and attach the resulting fence to a timeline, we defer that for now.

Note that with async unbinding, even while the unbind waits for the preceding bind to complete before unbinding, the vma itself might have been destroyed in the process, clearing the vma pages. Therefore we can only allow async unbinding if we have a refcounted sg-list and keep a refcount on that for the vma resource pages to stay intact until binding occurs. If this condition is not met, a request for an async unbind is diverted to a sync unbind.

v2:
- Use a separate kmem_cache for vma resources for now to isolate their memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless of whether the vm is closed, we need the unbind fence to properly wait for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if async unbind capable.
- Fix braces around a single-line if statement.
v6:
- Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
- Don't allow async unbinding if the vma_res pages are not the same as the object pages. (Matthew Auld)
v7:
- s/unsigned long/u64/ in a number of places (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
-
Thomas Hellström authored
A pin-count is already held by vma->pages, so taking an additional pin during async binds is not necessary. When we introduce async unbinding we have other means of keeping the object pages alive.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-4-thomas.hellstrom@linux.intel.com
-
Thomas Hellström authored
When introducing asynchronous unbinding, the vma itself may no longer be alive when the actual binding or unbinding takes place. Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource instead of a struct i915_vma for the bind_vma() and unbind_vma() ops. Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced i915_vma_resource counterparts, since they have the same lifetime. Also make sure to avoid changing the struct i915_vma_flags (in particular the bind flags) asynchronously. That should now only be done synchronously under the vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt
v6:
- Remove I915_VMA_ALLOC_BIT (Matthew Auld)
- Change some members of struct i915_vma_resource from unsigned long to u64 (Matthew Auld)
v7:
- Fix vma resource size parameters to be u64 rather than unsigned long (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
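The resulting ops shape, paraphrased from the description (signatures approximate, not copied from the tree):

    struct i915_vma_ops {
            void (*bind_vma)(struct i915_address_space *vm,
                             struct i915_vm_pt_stash *stash,
                             struct i915_vma_resource *vma_res,
                             enum i915_cache_level cache_level,
                             u32 flags);
            void (*unbind_vma)(struct i915_address_space *vm,
                               struct i915_vma_resource *vma_res);
    };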
-
Thomas Hellström authored
Introduce vma resources, sort of similar to TTM resources, needed for asynchronous bind management. Initially we will use them to hold completion of unbinding when we capture data from a vma, but they will be used extensively in upcoming patches for asynchronous vma unbinding.

v6:
- Some documentation updates

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-2-thomas.hellstrom@linux.intel.com
-
- Jan 10, 2022
-
-
Jani Nikula authored
We already have the gem/i915_gem_tiling.c file.

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/8073a429ed1f8ade9c0cc8a6ed1a0f82183100c5.1641561552.git.jani.nikula@intel.com
-
Jani Nikula authored
We already have the i915_gem_evict.c file.

v2: Fixed commit message (Tvrtko)

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ec666853171d04daeb21a93083940df36907c343.1641561552.git.jani.nikula@intel.com
-
- Dec 22, 2021
-
-
Thomas Hellström authored
Protect updates of struct i915_vma flags and async binding / unbinding with the vm::mutex. This means that i915_vma_bind() needs to assert vm::mutex held. In order to make that possible, drop the caching of kmap_atomic() maps around i915_vma_bind().

An alternative would be to use kmap_local(), but since we block cpu unplugging during sleeps inside kmap_local() sections, this may have unwanted side-effects, particularly since we might wait for gpu while holding the vm mutex.

This change may theoretically increase execbuf cpu-usage on snb, but at least on non-highmem systems that increase should be very small.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211221200050.436316-5-thomas.hellstrom@linux.intel.com
-
- Dec 20, 2021
-
-
Maarten Lankhorst authored
i915_vma_wait_for_bind() needs the vma lock held; fix the caller.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-5-maarten.lankhorst@linux.intel.com
-
Maarten Lankhorst authored
Big delta, but it boils down to moving set_pages to i915_vma.c and removing the special handling; all callers use the defaults anyway. We only remap in ggtt, so the default case will fall through.

Because we still don't require locking in i915_vma_unpin(), handle this by using xchg in get_pages(), as it's locked with obj->mutex, and cmpxchg in unpin, which only fails if we race against a new pin.

Changes since v1:
- aliasing gtt sets ZERO_SIZE_PTR, not -ENODEV, remove special case from __i915_vma_get_pages(). (Matt)
Changes since v2:
- Free correct old pages in __i915_vma_get_pages(). (Matt) Remove race of clearing vma->pages accidentally from put, free it but leave it set, as only get has the lock.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-4-maarten.lankhorst@linux.intel.com
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
-
Maarten Lankhorst authored
When reworking the code to move the eviction fence to the object, the best code is removed code. Remove some functions that are unused, and change the function definition if it's only used in one place.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> [mlankhorst: Remove new use of i915_active_has_exclusive]
Link: https://patchwork.freedesktop.org/patch/msgid/20211216142749.1966107-2-maarten.lankhorst@linux.intel.com
-