- 07 Oct, 2020 40 commits
-
Alejandro Piñeiro authored
This is needed for Vulkan because, per the spec (31.1. Limit Requirements), the minimum values for the following limits are:

maxPerStageDescriptorSampledImages 16
maxPerStageDescriptorStorageImages 4
maxPerStageDescriptorInputAttachments 4

We are using v3d textures for all of them, so the current limit would not be enough in some cases.

Note that, as the current comment explains, there is not exactly a HW limit for it, so we could bump it to 32, for example, but let's be conservative and ask for the minimum required.

It is worth noting that we needed to keep the same value for the OpenGL case, as it gets a register allocation failure in some GL cases. We tried to fix that with small changes to the nir scheduler, but we found that it would require some non-trivial effort to get it done (which we will eventually need to do).

Fixes tests like: dEQP-VK.binding_model.descriptorset_random.sets16.constant.ubolimitlow.sbolimitlow.imglimitlow.noiub.uab.comp.noia.0

v2: keep the previous limit for OpenGL (Eric)

Reviewed-by:
Eric Anholt <eric@anholt.net> Part-of: <!6999>
-
Tony Wasserka authored
Shaders may read out components past the attributes provided by the application, so the read mask can indicate a larger component count than was actually reserved in the array. Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!6728>
-
Tony Wasserka authored
Previously, ycbcr samplers were tightly packed with 4-byte alignment, but the structure requires 8-byte alignment. These samplers are now padded to 8-byte boundaries instead. Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!6728>
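The padding described in this commit message can be sketched as rounding each sampler's size and starting offset up to an 8-byte boundary. This is only an illustration under stated assumptions, not the driver's code: `align_up` and `padded_sampler_offset` are hypothetical names, and the sampler size is left as a parameter because the message does not state it.

```c
#include <assert.h>
#include <stddef.h>

/* Round `offset` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: byte offset of the sampler at `index` when every
 * sampler is padded to an 8-byte boundary instead of being tightly packed. */
static size_t padded_sampler_offset(size_t base, size_t sampler_size,
                                    size_t index)
{
    size_t stride = align_up(sampler_size, 8);
    return align_up(base, 8) + index * stride;
}
```

With tight 4-byte packing, a sampler could start at an offset such as 4 or 12; with the rounding above, every sampler starts on a multiple of 8.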
-
Tony Wasserka authored
"max_bindings + 1" was repeatedly used throughout this function, so talking about the binding *count* is more natural here. Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!6728>
-
Tony Wasserka authored
Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!6728>
-
Tony Wasserka authored
Vulkan allows for these input pointers to be null when the respective object count is zero. Calling memcpy with null pointers is undefined, so they are guarded with a check for the legit use pattern now. Reviewed-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <!6728>
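The guard described in this commit message might look roughly like the sketch below: memcpy is only reached when there is at least one object to copy, so a null pointer paired with a zero count never reaches it. The helper name `copy_objects` and its parameters are hypothetical and are not taken from the driver.

```c
#include <stddef.h>
#include <string.h>

/* Copy `count` elements of `elem_size` bytes from `src` to `dst`.
 * Vulkan allows `src` to be null when `count` is zero, and passing a
 * null pointer to memcpy is undefined behavior even for a zero-length
 * copy, so the call is skipped entirely in that case. */
static void copy_objects(void *dst, const void *src, size_t count,
                         size_t elem_size)
{
    if (count > 0)
        memcpy(dst, src, count * elem_size);
}
```

An application that passes a null pointer together with a non-zero count is still in error; the check only covers the pattern the API permits.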
-
Tony Wasserka authored
This was observed with the intel vulkan driver when running dEQP-VK.spirv_assembly.instruction.compute.float32.comparison_1.modfstruct with ubsan enabled. Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!6728>
-
Tony Wasserka authored
Notably this happened when applying constant folding on the intermediate computations generated from nir_lower_idiv. Reviewed-by:
Rhys Perry <pendingchaos02@gmail.com> Part-of: <!6728>
-
Dave Airlie authored
Reviewed-by:
Michel Dänzer <mdaenzer@redhat.com> Part-of: <!7017>
-
Marek Olšák authored
Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com> Part-of: <!6955>
-
Marek Olšák authored
Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com> Part-of: <!6955>
-
Marek Olšák authored
Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com> Part-of: <!6955>
-
Marek Olšák authored
the only limitation is that key=0 is not allowed Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com> Part-of: <!6955>
-
Boris Brezillon authored
Mali v6 (G72) doesn't support constants in blend equations, so let's use a shader in that case. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
The fixed-function blend logic uses the following equation: A + B x C. A, B and C are configurable and can be complemented with negation (for A and B) or inversion (for C) modifiers. Let's rework the blending code to take that into account. Note that we need to update the checksums of a few traces because the equations we use have changed, leading to small deviations in the final images. Indeed, there are several valid options for a given GL blend equation, but the operand selection probably has an impact on the rounding, leading to those mismatches. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
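The A + B x C form described in this commit message can be illustrated with a small single-channel evaluator. This is a sketch under assumptions, not the panfrost code: the struct and function names are hypothetical, "inversion" of C is assumed to mean 1 - C, and clamping, operand selection and per-channel handling are left out.

```c
#include <stdbool.h>

/* Hypothetical description of one fixed-function blend term. */
struct blend_term {
    float a, b, c;   /* operand values after selection */
    bool negate_a;   /* use -A instead of A */
    bool negate_b;   /* use -B instead of B */
    bool invert_c;   /* assumed to mean (1 - C) instead of C */
};

/* Evaluate A + B x C with the modifiers applied. */
static float eval_blend_term(const struct blend_term *t)
{
    float a = t->negate_a ? -t->a : t->a;
    float b = t->negate_b ? -t->b : t->b;
    float c = t->invert_c ? 1.0f - t->c : t->c;
    return a + b * c;
}
```

Because several operand and modifier choices can express the same GL blend equation, two mathematically equivalent configurations may round differently, which is consistent with the trace checksum changes the message mentions.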
-
Boris Brezillon authored
To signify when a struct is not meant to be packed directly but should instead be embedded in another struct. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
While at it, we also split the midgard and bifrost handling since there's not much to share. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Add missing fields and rename some of the existing ones. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Boris Brezillon authored
Add missing fields, and rename some of the existing fields. Signed-off-by:
Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by:
Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <!6980>
-
Marek Olšák authored
why would you want anything else

The only platform significantly affected by this is Intel where `lower_idiv` is not set today but neither is `lower_bitops`. There it seems to still be a boon over-all.

Shader-db results on Ice Lake:

total instructions in shared programs: 19719051 -> 19735766 (0.08%)
instructions in affected programs: 106992 -> 123707 (15.62%)
helped: 0
HURT: 445
HURT stats (abs) min: 3 max: 295 x̄: 37.56 x̃: 44
HURT stats (rel) min: 0.16% max: 33.33% x̄: 19.60% x̃: 19.38%
95% mean confidence interval for instructions value: 33.60 41.53
95% mean confidence interval for instructions %-change: 18.97% 20.23%
Instructions are HURT.

total loops in shared programs: 5973 -> 5973 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total cycles in shared programs: 489405810 -> 486917482 (-0.51%)
cycles in affected programs: 4759097 -> 2270769 (-52.29%)
helped: 406
HURT: 34
helped stats (abs) min: 2 max: 64661 x̄: 6291.95 x̃: 3126
helped stats (rel) min: 0.02% max: 79.42% x̄: 43.32% x̃: 55.83%
HURT stats (abs) min: 2 max: 29376 x̄: 1947.12 x̃: 30
HURT stats (rel) min: 0.04% max: 23.82% x̄: 4.66% x̃: 1.33%
95% mean confidence interval for cycles value: -6753.06 -4557.52
95% mean confidence interval for cycles %-change: -42.60% -36.63%
Cycles are helped.

total spills in shared programs: 12481 -> 12482 (<.01%)
spills in affected programs: 47 -> 48 (2.13%)
helped: 0
HURT: 1

total fills in shared programs: 12816 -> 12819 (0.02%)
fills in affected programs: 71 -> 74 (4.23%)
helped: 0
HURT: 1

total sends in shared programs: 1010124 -> 1010124 (0.00%)
sends in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST: 1
GAINED: 0

Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net> Part-of: <!6963>
-
Marek Olšák authored
We need to enable and bind everything on the glthread side too. The behavior was copied from _mesa_InterleavedArrays. Reviewed-by:
Ian Romanick <ian.d.romanick@intel.com> Part-of: <!6874>
-
Marek Olšák authored
This is an optimization for SPECviewperf. The increase in lines of code is only 14%. Reviewed-by:
Ian Romanick <ian.d.romanick@intel.com> Part-of: <!6874>
-
Marek Olšák authored
GET_DISPATCH returns CurrentClientDispatch, which invokes glthread if it's enabled. GL function implementations should never call back to glthread. Reviewed-by:
Ian Romanick <ian.d.romanick@intel.com> Part-of: <!6874>
-
Serge Martin authored
Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Karol Herbst authored
Reviewed-by:
Serge Martin <edb@sigluy.net> Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Karol Herbst authored
Reviewed-by:
Serge Martin <edb@sigluy.net> Acked-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
This makes CTS test_compiler happier. Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Karol Herbst <kherbst@redhat.com> Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Pierre Moreau <dev@pmoreau.org> Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-
Serge Martin authored
Reviewed-by:
Francisco Jerez <currojerez@riseup.net> Part-of: <!4974>
-