Commits · master · Tomeu Vizoso / mesa

Jun 03, 2021

TEMP: Add profile jobs for Iris boards · 880d1919

Tomeu Vizoso authored Jun 01, 2021

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>

util/format: Change the pointer offset. · 8251bd21

Sergii Melikhov authored May 27, 2021

Changed the pointer offset to 2 to account for the second structure variable.

Fixes: 90f98b56 ("mesa: Deduplicate _mesa_pack_uint_z_row().")
Closes: mesa/mesa#4685


Signed-off-by: Sergii Melikhov <sergii.v.melikhov@globallogic.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <mesa/mesa!11060>

vulkan/wsi: provide more info in wsi_image_create_info · 447e80ac

Chia-I Wu authored May 12, 2021



Always chain wsi_image_create_info to VkImageCreateInfo, which indicates
that the image is a wsi image and can be transitioned to/from
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.

Add prime_blit_buffer to the struct as well.  When set, it indicates the
prime blit destination and implies that the image is a prime blit
source.
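
The chaining described above can be modelled as a lookup along a linked list of extension structs. The following is only a sketch in Python: the class names are stand-ins for `VkImageCreateInfo` and `wsi_image_create_info`, `find_in_chain` and `classify` are hypothetical helpers, and real Vulkan code identifies chained structs by their `sType` field rather than by `isinstance`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WsiImageCreateInfo:
    """Stand-in for wsi_image_create_info (illustrative only)."""
    prime_blit_buffer: Optional[object] = None
    p_next: Optional[object] = None


@dataclass
class ImageCreateInfo:
    """Stand-in for VkImageCreateInfo (illustrative only)."""
    p_next: Optional[object] = None


def find_in_chain(head, wanted_type):
    """Walk the p_next chain and return the first node of wanted_type."""
    node = head.p_next
    while node is not None:
        if isinstance(node, wanted_type):
            return node
        node = node.p_next
    return None


def classify(info: ImageCreateInfo) -> str:
    """Interpret the chain the way the commit message describes it."""
    wsi = find_in_chain(info, WsiImageCreateInfo)
    if wsi is None:
        return "regular image"
    if wsi.prime_blit_buffer is not None:
        return "wsi image, prime blit source"
    return "wsi image"


print(classify(ImageCreateInfo(p_next=WsiImageCreateInfo())))
```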

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <mesa/mesa!10789>

intel/gfx6: move xfb_setup outside the gs compiler into the driver. · 64fa67dd

Dave Airlie authored May 24, 2021



This removes the use of a GL thing from the backend compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <!11097>

aco: don't create 4 and 5 dword NSA instructions on GFX10 · 903f814b

Rhys Perry authored Jun 02, 2021

"stability issues", apparently: https://reviews.llvm.org/D103348



fossil-db (Navi10):
Totals from 4512 (3.01% of 149839) affected shaders:
VGPRs: 221516 -> 223308 (+0.81%); split: -0.07%, +0.88%
CodeSize: 23000080 -> 23070672 (+0.31%); split: -0.08%, +0.39%
MaxWaves: 107718 -> 107496 (-0.21%); split: +0.11%, -0.32%
Instrs: 4321890 -> 4362822 (+0.95%); split: -0.00%, +0.95%
Latency: 71495710 -> 71581476 (+0.12%); split: -0.07%, +0.19%
InvThroughput: 11858568 -> 11938960 (+0.68%); split: -0.00%, +0.68%
VClause: 76575 -> 76585 (+0.01%); split: -0.05%, +0.07%
SClause: 168771 -> 168709 (-0.04%); split: -0.06%, +0.02%
Copies: 182305 -> 221948 (+21.75%); split: -0.00%, +21.75%
PreVGPRs: 194657 -> 195635 (+0.50%); split: -0.00%, +0.50%
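
The percentages in the report above are plain relative changes between the two totals. A short sketch that recomputes a few of them from the figures quoted above (`percent_change` is a helper introduced here for illustration, not part of fossil-db):

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Totals copied from the fossil-db report above.
totals = {
    "VGPRs": (221516, 223308),
    "Instrs": (4321890, 4362822),
    "Copies": (182305, 221948),
}

for name, (before, after) in totals.items():
    change = percent_change(before, after)
    print(f"{name}: {before} -> {after} ({change:+.2f}%)")
```

The large jump in `Copies` stands out against the sub-1% movement in the other totals.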

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c ("aco: use non-sequential addressing")
Part-of: <mesa/mesa!10898>

aco/tests: improve reporting of failed code checks · bb52484d

Rhys Perry authored May 20, 2021



Instead of just reporting the failed statements, print where they
originated. This is useful for tests which have a number of similar
checks.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <mesa/mesa!10898>

aco/tests: add tests for form_hard_clauses() · 9bf30c4a

Rhys Perry authored May 20, 2021



Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <mesa/mesa!10898>

aco: do not clause NSA instructions · 81162265

Rhys Perry authored May 19, 2021

According to LLVM, this has "unpredictable results on GFX10.1".

https://reviews.llvm.org/D102211



fossil-db (Navi10):
Totals from 26690 (17.81% of 149839) affected shaders:
CodeSize: 167935160 -> 167706280 (-0.14%); split: -0.14%, +0.00%
Instrs: 31801427 -> 31744142 (-0.18%); split: -0.18%, +0.00%
Latency: 732672435 -> 732622463 (-0.01%)
InvThroughput: 163361435 -> 163357838 (-0.00%); split: -0.00%, +0.00%
VClause: 546131 -> 546903 (+0.14%); split: -0.00%, +0.14%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Fixes: c353895c ("aco: use non-sequential addressing")
Part-of: <mesa/mesa!10898>

freedreno/a6xx: Fix mh31 intermittent faults · e0488535

Rob Clark authored Jun 02, 2021



It appears that CP can over-fetch push constants slightly.  While it
otherwise has no problem fetching from an alignment of 32 bytes, if that
32 bytes is at the end of a mapped bo, this can trigger fetching up to
32 bytes beyond the patch, triggering an iova fault.  While otherwise
"harmless", it is probably better to not have random intermittent
faults.
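
The failure condition can be expressed as simple bounds arithmetic. The sketch below is illustrative only: the function names are hypothetical, and reserving slack after the payload is just one possible mitigation, not necessarily what this commit does.

```python
OVERFETCH_BYTES = 32  # worst-case over-fetch described in the message above


def may_fault(bo_size: int, offset: int, length: int) -> bool:
    """True if fetching [offset, offset + length) plus the worst-case
    over-fetch could read past the end of the mapped buffer."""
    return offset + length + OVERFETCH_BYTES > bo_size


def padded_size(payload_size: int) -> int:
    """Hypothetical mitigation: reserve slack after the payload."""
    return payload_size + OVERFETCH_BYTES


# A 32-byte fetch that ends exactly at the end of a 4096-byte buffer.
print(may_fault(4096, 4064, 32))
print(may_fault(padded_size(4096), 4064, 32))
```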

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <mesa/mesa!11142>

docs/freedreno: Rewrite the section on array access. · 3b195459

Emma Anholt authored Jun 02, 2021

We don't use collect/split for array access these days; instead, we use
ir3_array structs that the ir3_register can point to.

Part-of: <mesa/mesa!11147>

docs/freedreno: Update for the fanin/fanout -> collect/split rename. · 95cffbcd

Emma Anholt authored Jun 02, 2021

See 611258d5 ("freedreno/ir3: rename fanin/fanout to collect/split")

Part-of: <mesa/mesa!11147>

ci/freedreno: Add some more known flakes from recent marge runs. · d3e419f9

Emma Anholt authored Jun 02, 2021

Part-of: <mesa/mesa!11144>

docs: update calendar and link releases notes for 21.1.2 · 7949ff56

Eric Engestrom authored Jun 02, 2021

Part-of: <mesa/mesa!11148>

docs: add release notes for 21.1.2 · e0ad9f43

Eric Engestrom authored Jun 02, 2021

Part-of: <mesa/mesa!11148>

intel/fs: Handle non-perspective-correct interpolation on gen4-5 · f5e58838

Faith Ekstrand authored Apr 24, 2020

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <!11125>

st/nir: always revectorise if scalarising happens. · 1956ff08

Dave Airlie authored May 18, 2021



This fixes arb_gpu_shader_fp64-vs-non-uniform-control-flow-ssbo
on crocus.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <!11098>

zink: Add a missing VKAPI_ATTR. · 1fe0bb53

Georg Lehmann authored Jun 01, 2021



Closes #4868

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reported-by: Roman Stratiienko <r.stratiienko@gmail.com>
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <!11115>

llvmpipe: add the interesting bit of cpu detection to the cache. · 9520b70f

Dave Airlie authored May 24, 2021



This should detect when someone changes a CPU configuration that matters, such as in a VM.
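
The general idea is that anything affecting generated code has to feed into the cache key, so a changed CPU configuration yields a miss instead of a stale hit. A minimal sketch, assuming a dictionary of detected CPU features; `cache_key` and the feature names are hypothetical and do not reflect llvmpipe's actual key layout:

```python
import hashlib


def cache_key(shader_ir: bytes, cpu_caps: dict) -> str:
    """Hash the shader together with the CPU features that affect codegen."""
    h = hashlib.sha1()
    h.update(shader_ir)
    # Sort the names so the key does not depend on dictionary order.
    for name in sorted(cpu_caps):
        h.update(f"{name}={int(cpu_caps[name])};".encode())
    return h.hexdigest()


host = {"avx2": True, "sse4_1": True}
guest = {"avx2": False, "sse4_1": True}  # e.g. a VM that masks AVX2

# Same shader, different CPU configuration: the keys differ.
print(cache_key(b"shader", host) != cache_key(b"shader", guest))
```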

Reviewed-by: Emma Anholt <emma@anholt.net>
Fixes: 6c0c61cb ("llvmpipe: add infrastructure for disk cache support")
Part-of: <mesa/mesa!10946>

u_format: Use the computed BE channels/swizzles for bitmask formats. · 74034635

Emma Anholt authored Apr 28, 2021



No more error-prone encoding of swizzles in the .csv for non-planar
formats!

No change to generated u_format_table.c

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Sanity check that BE swizzles are appropriately mapped from LE. · 1c199726

Emma Anholt authored Apr 28, 2021



Once you read enough of them, there's an obvious pattern that we can just
write a little code for instead of making every dev write it out each time.
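
The message does not spell the pattern out. One plausible reading, stated here as an assumption, is that for bitmask formats the BE channel list is the LE list reversed, and each swizzle that selects a channel is mirrored to match. The sketch below uses hypothetical helper names and is not the code in u_format_parse.py:

```python
# Swizzle values: four channel selectors followed by constants.
SWIZZLE_X, SWIZZLE_Y, SWIZZLE_Z, SWIZZLE_W, SWIZZLE_0, SWIZZLE_1, SWIZZLE_NONE = range(7)


def be_channels_from_le(le_channels):
    """Assumed rule: BE channels are the LE channels in reverse order."""
    return list(reversed(le_channels))


def be_swizzles_from_le(le_swizzles, nr_channels):
    """Assumed rule: mirror channel-selecting swizzles, keep constants."""
    return [
        nr_channels - 1 - s if s <= SWIZZLE_W else s
        for s in le_swizzles
    ]


# Channel bit widths of a 5-5-5-1 bitmask format.
print(be_channels_from_le([5, 5, 5, 1]))

# A three-channel format whose LE swizzle is "zyx1".
le = [SWIZZLE_Z, SWIZZLE_Y, SWIZZLE_X, SWIZZLE_1]
print(be_swizzles_from_le(le, nr_channels=3))
```

Under this assumed rule, the "zyx1" LE swizzle of a three-channel format maps to "xyz1" on BE.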

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Sanity check the BE channels for all bitmask formats. · 36569b9f

Emma Anholt authored Apr 27, 2021



Just check against the CSV (which has its codegen now tested with
u_format_test in CI) for now, so we know that our computed channels are
correct.

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Fix the BE channel ordering for R5G5B5A1_UINT. · 9d77cecf

Emma Anholt authored Apr 27, 2021



It notably didn't fit the pattern of RGB5_A1_UNORM, and violated the
general pattern for bitmask format BE channels (channels are ordered
right-to-left in the BE columns in the CSV due to the parser walking them
in that order for historical reasons).

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Define tests for r3g3b2 formats and fix BE swizzles for them. · 4dac360d

Emma Anholt authored Apr 27, 2021



These tests passed for LE, and the BE channel ordering specified obviously
didn't fit the pattern of the other BE formats (channels are listed
right-to-left in the BE columns for historical reasons).

Note that we can't write pure-integer format tests in u_format_tests.c
currently.

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Assert that array formats don't include BE swizzles. · c144f988

Emma Anholt authored Apr 27, 2021



Z32_FLOAT_S8X24_UINT and X32_S8X24_UINT are in fact the only non-bitmask
formats that have BE swizzles specified, but sorting out those two is
harder.

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Use the nice helper for reversing an array. · 397e8076

Emma Anholt authored Jun 01, 2021



Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Move the BE swizzle computation into Format init. · c8ef4f36

Emma Anholt authored Apr 27, 2021



I wanted to do the next set of BE changes here, where I have Format's
helper functions available.

No changes in generated u_format_table.c.

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Drop redundant .name init. · 8a407804

Emma Anholt authored Apr 27, 2021



It's the first member that's set.

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

u_format: Fix some pep8 in u_format_parse.py. · a7fdddb1

Emma Anholt authored Apr 27, 2021



My editor likes to enforce pep8. Here's some low-hanging fruit so I don't
have to do too much git add -p.

Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <mesa/mesa!10505>

Jun 02, 2021

turnip: fix register_index calculations of xfb outputs · b71e27ea

Danylo Piliaiev authored Jun 01, 2021



nir_assign_io_var_locations() does not use outputs_written when
assigning driver locations. Use driver_location to avoid incorrectly
guessing what locations it assigned.

Copied from lavapipe 8731a1be

Will fix the provoking vertex transform feedback tests once
VK_EXT_provoking_vertex is enabled:
 dEQP-VK.rasterization.provoking_vertex.transform_feedback.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <!11111>

turnip: emit vb stride dynamic state when it is dirty · 551d7fdd

Danylo Piliaiev authored Jun 02, 2021

Due to an incorrect condition, we never emitted the vb stride
if the state was set dynamically.

Fixes vertex explosion with Zink.

See #4738



Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Part-of: <!11133>

iris: Use bo->mmap_mode in transfer map read check · ccfde508

Kenneth Graunke authored May 26, 2021



The scenario we want to avoid is reading from WC or UC mappings,
so this is an easier-to-follow check.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Pick a single mmap mode (WB/WC) at BO allocation time · f62724cc

Kenneth Graunke authored May 20, 2021

Previously, iris_bufmgr had the ability to maintain multiple
simultaneous memory mappings for a BO, one in WB mode (with CPU caches),
and another in WC (streaming) mode.  Depending on the flags passed to
iris_bo_map(), we would select one mode or the other.

The rules for deciding which to use were:

- Systems with LLC always use WB mode because it's basically free
- Non-LLC systems used...
  - WB maps for all BOs where snooping is enabled (which translates to
    when BO_ALLOC_COHERENT is set at allocation time)
  - WB maps for reads unless persistent, coherent, async, or raw.
  - WC maps for everything else.
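
The old rules listed above can be read as a single decision function. This is a sketch of that reading, not the iris code: `legacy_map_mode` and its parameters are hypothetical names, with `bo_coherent` standing for a BO allocated with BO_ALLOC_COHERENT and the remaining flags standing for the map flags passed to iris_bo_map().

```python
def legacy_map_mode(has_llc, bo_coherent, is_read,
                    persistent=False, coherent=False,
                    asynchronous=False, raw=False):
    """Pick "WB" or "WC" per map call, following the old rules."""
    if has_llc:
        return "WB"  # LLC systems: WB is basically free
    if bo_coherent:
        return "WB"  # snooped BO
    if is_read and not (persistent or coherent or asynchronous or raw):
        return "WB"  # plain reads got a cached mapping
    return "WC"      # everything else


# On a non-LLC system, the same BO could end up with both kinds of map.
print(legacy_map_mode(False, False, is_read=True))
print(legacy_map_mode(False, False, is_read=False))
```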

This patch simplifies the system by selecting a single mmap mode at
BO allocation time, and always using that.  Each BO now has at most one
map at a time, rather than up to two (or three before we deleted GTT
map support in recent patches).

In practical terms, this eliminates the capability to use WB maps for
reads of non-snooped BOs on non-LLC systems.  Such reads would now be
slow, uncached reads.  However, iris_transfer_map recently began using
staging blits for such reads - so the GPU copies the data to a snooped
buffer which will be mapped WB.  So, rather than incurring slow UC
reads, we really just take the hit of a blit, and retain fast reads.

The rest of the rules remain the same.
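
Under the new scheme the mode depends only on what is known at allocation time. A minimal sketch, assuming the remaining rules reduce to "LLC or snooped means WB, otherwise WC"; `BO`, `pick_mmap_mode`, and `bo_alloc` are illustrative names, not the iris_bufmgr API:

```python
from dataclasses import dataclass


def pick_mmap_mode(has_llc: bool, alloc_coherent: bool) -> str:
    """Choose the one mmap mode a BO will use for its whole lifetime."""
    return "WB" if (has_llc or alloc_coherent) else "WC"


@dataclass
class BO:
    size: int
    mmap_mode: str  # fixed at allocation; map flags no longer change it


def bo_alloc(size: int, has_llc: bool, alloc_coherent: bool = False) -> BO:
    return BO(size=size, mmap_mode=pick_mmap_mode(has_llc, alloc_coherent))


print(bo_alloc(4096, has_llc=False).mmap_mode)
print(bo_alloc(4096, has_llc=False, alloc_coherent=True).mmap_mode)
```

Because the mode no longer depends on the map flags, a read of a non-snooped BO on a non-LLC system sees a WC mapping, which is where the staging blit mentioned above comes in.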

There are a few reasons for this:

1. TTM doesn't support mapping an object as both WB and WC.  The
   cacheability is treated as a property of the object, not the map.
   The kernel is moving to use TTM as part of adding discrete local
   memory support.  So it makes sense to centralize on that model.

2. Mapping the same BO as both WB and WC is impossible to support on
   some CPUs.  It works on x86/x86_64, which was fine for integrated
   GPUs, but it may become an issue for discrete graphics paired with
   other CPUs (should that ever be a thing we need to support).

3. It's overall much simpler.  We only have one bo->map field, and
   manage to drop a significant amount of boilerplate.

One issue that arises is the interaction with the BO cache: BOs with
WB maps and WC maps will be lumped together into the same cache.  This
means that a cached BO may have the wrong mmap mode.  We check that,
and if it doesn't match, we unmap it, waiting until iris_bo_map is
called to restore one with the desired mode.  This may underutilize
cache mappings slightly on non-LLC systems, but I don't expect it to
have a large impact.
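
The cache interaction can be sketched as follows. All names here are hypothetical, and the string stored in `map` merely stands in for a real mapping: a reused BO whose mode does not match drops its mapping and gets a new one lazily on the next map call.

```python
class CachedBO:
    def __init__(self, mmap_mode: str):
        self.mmap_mode = mmap_mode
        self.map = None  # at most one mapping per BO


def bo_map(bo: CachedBO) -> str:
    """Create the mapping on first use, in the BO's current mode."""
    if bo.map is None:
        bo.map = f"<{bo.mmap_mode} mapping>"
    return bo.map


def reuse_from_cache(bo: CachedBO, wanted_mode: str) -> CachedBO:
    """Hand out a cached BO, discarding a mapping of the wrong mode."""
    if bo.mmap_mode != wanted_mode:
        bo.map = None  # unmap now; bo_map() recreates it later
        bo.mmap_mode = wanted_mode
    return bo


bo = CachedBO("WC")
bo_map(bo)
bo = reuse_from_cache(bo, "WB")
print(bo.map)      # mapping was dropped
print(bo_map(bo))  # recreated with the desired mode
```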

Closes: mesa/mesa#4747


Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Delete GTT mapping support · 22bfb535

Kenneth Graunke authored May 20, 2021



In the bad old days, i965 used GTT mapping for detiling maps.  iris
never has, however.  The only reason it used GTT maps was in weird
fallback cases for dealing with BO imports from foreign memory.  We
now do staging blits for those, and never mmap them.

There are no more users of GTT mapping, so we can delete it.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Drop fallback GEM_MMAP_GTT if GEM_MMAP with I915_MMAP_WC fails · 2f30cf4a

Kenneth Graunke authored May 20, 2021

XXX: This is actually wrong. According to Joonas, an imported dmabuf can
be mapped via GEM_MMAP_GTT if the IOMMU is working, whereas GEM_MMAP
would fail, so we would need this fallback. Alternatively, we would need
to flag such imported dmabufs as unmappable and make
iris_transfer_map/unmap always do blits instead of direct mappings. That
seems like the saner approach.

We never want to use GEM_MMAP_GTT, as it does detiling maps, and iris
always wants direct maps. There were originally two cases that this
fallback path was attempting to handle:

1. The BO was allocated from stolen memory that we can't GEM_MMAP.

At one point, kernel patches were being proposed to use stolen
memory for userspace buffers, but these never landed. The kernel
has never given us stolen memory, so we cannot hit this case.

2. Imported objects may be from memory we can't GEM_MMAP.

For example, a DMABUF from a discrete AMD/NVIDIA GPU in a PRIME
setup would be backed by memory that we can't GEM_MMAP. We could
try and mmap these directly with GEM_MMAP_GTT, but that relies on
the IOMMU working. We could also mmap the DMABUF fd directly (we have
never tried to do so), but there are complex rules there. Instead,
we now flag those imports and rely on the iris_transfer_map
code to perform staging blits on the GPU, so we never even try to
map them directly. This case therefore no longer reaches us here.

With both of those out of the way, there is no need for a fallback.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Assert on mapping a tiled buffer without MAP_RAW · 05a43d42

Kenneth Graunke authored May 20, 2021



iris has never relied on detiled maps using hardware fences.
This code is a remnant of i965, where that was actually used.

We can just assert that callers don't do such a thing.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Use staging blits for transfers involving imported BOs · 3319ab0d

Kenneth Graunke authored May 20, 2021

Direct mappings of imported DMABUFs can be tricky. If they're allocated
from our own device, then we can probably mmap them and it'd be fine.
But they may come from a different device (such as a discrete GPU), in
which case I915_GEM_MMAP wouldn't work, I915_GEM_MMAP_GTT would require
a working IOMMU, and directly mmap'ing the DMABUF fd would come with a
bunch of rules and restrictions which are hard to get right.

CPU mapping an imported DMABUF image for writes seems very uncommon,
solidly in the "what are you even doing?" realm. Mapping an imported
DMABUF for reading might be a thing, in case someone wanted to do
glReadPixels on it. But in that case, the cost of doing a staging
blit is probably acceptable.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Use staging blits for reads from uncached buffers. · 643c4ade

Kenneth Graunke authored May 20, 2021



If we're doing CPU reads of a resource that doesn't have CPU caches
enabled for the mapping (say, in device local memory, or WC mapped),
then blit it to a temporary that does have those caches enabled.
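
In outline, the read path checks the mapping's cacheability before deciding how to service the transfer. A sketch with hypothetical names (the real check lives in iris_transfer_map and considers more conditions than shown here):

```python
def transfer_map_strategy(is_read: bool, mmap_mode: str) -> str:
    """Decide how to service a CPU transfer.

    mmap_mode is "WB" (CPU-cached), "WC" (write-combined) or "UC" (uncached).
    """
    if is_read and mmap_mode != "WB":
        # Reading through WC/UC is slow: copy to a cached temporary first.
        return "staging blit"
    return "direct map"


for mode in ("WB", "WC", "UC"):
    print(mode, "->", transfer_map_strategy(True, mode))
```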

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Track imported vs. exported status separately · 49070038

Kenneth Graunke authored May 20, 2021



Not all external objects are the same.  Imported buffers may be from
other devices (say a dmabuf from an AMD or NVIDIA discrete card) which
are backed by memory that we can't use with I915_GEM_MMAP.  However,
exported buffers are ones that we know we allocated ourselves from our
own device.  We may not know what other clients are doing with them,
but we can assume a bit more about where they came from.
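
A sketch of what separate tracking might look like, with illustrative field and function names rather than the actual iris_bo layout. "External" becomes a derived property, while only the imported flag says anything about where the memory came from:

```python
from dataclasses import dataclass


@dataclass
class BO:
    imported: bool = False  # received from elsewhere, possibly another device
    exported: bool = False  # allocated by us, then shared with others


def bo_is_external(bo: BO) -> bool:
    """Other clients may be touching this buffer."""
    return bo.imported or bo.exported


def allocated_by_us(bo: BO) -> bool:
    """Only imports can be backed by memory from a foreign device."""
    return not bo.imported


shared = BO(exported=True)
print(bo_is_external(shared), allocated_by_us(shared))
```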

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Make an iris_bo_is_external() helper and use it in a few places · 1a395e10

Kenneth Graunke authored May 20, 2021



I'd like to start tracking "imported" vs. "exported" for objects,
rather than a blanket "external" flag.  Instead of directly checking
bo->external, use a new helper that will eventually be "imported or
exported".

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>

iris: Delete a comment suggesting we use tiled staging buffers · 1c73445d

Kenneth Graunke authored May 20, 2021



We basically tried this, and it performed worse, so delete the
suggestion in the comments that we may want to do it someday.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <mesa/mesa!10941>
