- Aug 20, 2019
-
-
Erico Nunes authored
ra_get_best_spill_node is what other users of the mesa register allocator use. Switching to it now also fixes an infinite loop issue in ppir regalloc with the ppir control flow patchset, and also provides a small gain over the previous heuristic in the number of spilled nodes when testing with shader-db. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Emma Anholt authored
Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
Emma Anholt authored
The SSE2 executor was removed in 4eb3225b ("Remove tgsi_sse2.") Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
Emma Anholt authored
This was fixed in 912ed84f ("tgsi: move to using vector for system values.") Acked-by: Eric Engestrom <eric@engestrom.ch> Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
-
Emma Anholt authored
The GLES2 CTS takes about 8 minutes of total runtime (at parallel 4, that is ~2 minutes in the test stage if runners are free), while GLES3 takes about 25. Since the GLES3 run is pretty expensive, just do a cheap touch test of 1 out of every 10 tests in the test list on MRs until we can get the runtime down. v2: Drop the full run for now until we can bring runtime down or bring up a dedicated mesa runner. Reviewed-by: Eric Engestrom <eric@engestrom.ch> (v1) Reviewed-By: Gert Wollny <gert.wollny@collabora.com> (v1)
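The "1 out of every 10 tests" sampling described above can be sketched as a simple stride filter over the test list. This is only an illustration: the function name `demo_select_every_nth` is hypothetical, and picking the first test of each group of ten (rather than some other offset) is an assumption, since the commit message does not say which one is chosen.

```c
#include <stddef.h>

/* Hypothetical helper: copy every `stride`-th entry of `tests` into `out`,
 * starting with index 0 (an assumption), and return how many were kept.
 * `out` must have room for at least (count + stride - 1) / stride entries. */
static size_t demo_select_every_nth(const char *const *tests, size_t count,
                                    size_t stride, const char **out)
{
    size_t kept = 0;

    if (stride == 0)
        return 0; /* a zero stride would never advance */

    for (size_t i = 0; i < count; i += stride)
        out[kept++] = tests[i];

    return kept;
}
```

With a stride of 10, a list of 25 tests would be reduced to the entries at indices 0, 10 and 20.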
-
José María Casanova Crespo authored
This fixes the regression introduced on "mesa: refactor compressed_tex_sub_image function" that started to crash KHR-GLES2.texture_3d.compressed_texture.negative_compressed_tex_sub_image Fixes: 7df233d6 ("mesa: refactor compressed_tex_sub_image function") Reviewed-by: Eric Anholt <eric@anholt.net>
-
Adam Jackson authored
These are redundant with glx_config::renderType, let's just use that consistently.
-
Adam Jackson authored
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
-
Adam Jackson authored
Simpler, less failure prone, less malloc overhead, what's not to like. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
-
Adam Jackson authored
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
-
Adam Jackson authored
'minimum_size' is not, in fact, an argument to this function. Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
-
Arcady Goldmints-Orlov authored
In a descriptor set, inline uniform blocks don't use up any bindings. However, the presence of any inline uniform blocks does require the use of the descriptor buffer, which takes up one binding. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
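The counting rule above can be sketched as follows. This is a simplified model, not the driver's code: `demo_count_bindings` is a hypothetical name, and treating every non-inline descriptor as costing exactly one binding is an assumption made to keep the example small.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: is_inline_block[i] is true when descriptor i is an
 * inline uniform block. Inline blocks consume no binding of their own, but
 * if at least one is present, the descriptor buffer costs one binding.
 * Assumption: every other descriptor costs exactly one binding. */
static unsigned demo_count_bindings(const bool *is_inline_block, size_t count)
{
    unsigned bindings = 0;
    bool any_inline = false;

    for (size_t i = 0; i < count; i++) {
        if (is_inline_block[i])
            any_inline = true;
        else
            bindings++;
    }

    if (any_inline)
        bindings++; /* one binding for the descriptor buffer */

    return bindings;
}
```

Under these assumptions, a set with two regular descriptors and two inline uniform blocks would need three bindings in total.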
-
Daniel Schürmann authored
This pass expects the shader to be in LCSSA form. The algorithm is based on 'The Simple Divergence Analysis' from Diogo Sampaio, Rafael De Souza, Sylvain Collange, Fernando Magno Quintão Pereira. Divergence Analysis. ACM Transactions on Programming Languages and Systems (TOPLAS). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
-
The behavior for reductions with cluster_size >= subgroup_size is implementation defined. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
ACO depends on LCSSA phis for divergent booleans to work correctly. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
-
Daniel Schürmann authored
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: 414148cd "nir: Support deref instructions in loop_analyze"
-
The implementation introduced in "tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)" updates all the TGSI properties, but it didn't take into account that the shader_info structure uses a union to store the different attributes for each shader stage. Now we only update the attributes if they affect the current shader stage, so we avoid overwriting members of the union that should not be overwritten. The previous behavior created hundreds of regressions in v3d. For example, TGSI_PROPERTY_VS_BLIT_SGPRS_AMD was overwriting the same position used by TGSI_PROPERTY_CS_FIXED_BLOCK_DEPTH. Fixes: e3003651 ("tgsi_to_nir: be careful about not losing any TGSI properties silently (v2)") Reviewed-by: Marek Olšák <marek.olsak@amd.com>
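The union hazard described above can be illustrated with a reduced model. Everything here is hypothetical and simplified (the `demo_` types, the two properties, the field names); it is not the real shader_info layout, only a sketch of why a per-stage check is needed before writing into a union.

```c
/* Hypothetical, simplified stand-ins for the real types. */
enum demo_stage { DEMO_STAGE_VERTEX, DEMO_STAGE_COMPUTE };
enum demo_property { DEMO_PROP_VS_BLIT_SGPRS, DEMO_PROP_CS_FIXED_BLOCK_DEPTH };

struct demo_shader_info {
    enum demo_stage stage;
    union {
        struct { unsigned blit_sgprs; } vs;  /* valid for vertex shaders only */
        struct { unsigned block_depth; } cs; /* valid for compute shaders only */
    } u; /* both members share the same storage */
};

/* Apply a property only when it belongs to the shader's own stage.
 * Returns 1 if the value was stored, 0 if it was skipped. Without the stage
 * checks, a vertex-only property would clobber cs.block_depth. */
static int demo_set_property(struct demo_shader_info *info,
                             enum demo_property prop, unsigned value)
{
    switch (prop) {
    case DEMO_PROP_VS_BLIT_SGPRS:
        if (info->stage != DEMO_STAGE_VERTEX)
            return 0;
        info->u.vs.blit_sgprs = value;
        return 1;
    case DEMO_PROP_CS_FIXED_BLOCK_DEPTH:
        if (info->stage != DEMO_STAGE_COMPUTE)
            return 0;
        info->u.cs.block_depth = value;
        return 1;
    }
    return 0;
}
```

For a compute shader in this model, setting the block depth is stored, while a later vertex-only property is skipped and leaves the stored depth intact.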
-
Samuel Pitoiset authored
CLEAR_STATE emits it for us. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
It's emitted by the kernel. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Gert Wollny authored
If a drawbuffer is an fbo without an attachment then its 'Height' will be zero, and we have to take its 'DefaultGeometry.Height' into account. Fixes on softpipe (with the exception of tests that use multisample): dEQP-GLES31.functional.fbo.no_attachments.* Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
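The fallback described above amounts to choosing the default geometry when the framebuffer has no attachment. A minimal sketch, assuming simplified hypothetical structures (`demo_framebuffer`, `demo_geometry`) rather than the real Mesa types:

```c
/* Hypothetical, reduced versions of the framebuffer state. */
struct demo_geometry {
    unsigned width;
    unsigned height;
};

struct demo_framebuffer {
    unsigned height;                       /* zero when nothing is attached */
    struct demo_geometry default_geometry; /* used for attachment-less FBOs */
};

/* Return the height to use for rendering: the attachment-derived height if
 * there is one, otherwise the default geometry's height. */
static unsigned demo_effective_height(const struct demo_framebuffer *fb)
{
    return fb->height != 0 ? fb->height : fb->default_geometry.height;
}
```

Any code that depends on the drawbuffer height would then call the helper instead of reading the height field directly.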
-
v2: Use GEN_GEN in iris_state (Kenneth Graunke) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
-
Create a separate SURFACE_STATE for render target reads in order to support non-coherent framebuffer fetch on Broadwell. We also need to resolve the framebuffer in order to support CCS_D. v2: Add outputs_read check (Kenneth Graunke) v3: 1) Import Curro's comment from get_isl_surf 2) Rename get_isl_surf method 3) Clean up allocation in case of failure Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
All helper functions are ported from the i965 driver. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
v2: Add missing space (Caio) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
This will be used in the next patches to support non-coherent framebuffer fetch on Broadwell. v2: Fix comment (Kenneth Graunke) v3: 1) Fix a few nits (Caio) 2) Add comment (Caio) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
When building Mesa against a recent LLVM 10 with C++11, the build fails if the AMD common code is built as well due to "std::index_sequence" being undeclared. LLVM requires a minimum of C++14. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Acked-by: Eric Engestrom <eric@engestrom.ch>
-
Rob Herring authored
The kernel now supports a madvise ioctl to indicate which BOs can be freed when there is memory pressure. Mark BOs purgeable when they are in the BO cache. The BOs must also be munmapped while they are in the cache, or they cannot be purged. We could skip the madvise ioctl on older kernels once the driver version bump lands, but that is probably not worth it given the other driver features also being added. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org>
-
Rob Herring authored
Sync the panfrost_drm.h UAPI header with the latest from the kernel. This adds the madvise ioctl and GPU feature params. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Signed-off-by: Rob Herring <robh@kernel.org>
-
- Aug 19, 2019
-
-
Pierre-Eric Pelloux-Prayer authored
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Pierre-Eric Pelloux-Prayer authored
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Pierre-Eric Pelloux-Prayer authored
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Pierre-Eric Pelloux-Prayer authored
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Pierre-Eric Pelloux-Prayer authored
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Pierre-Eric Pelloux-Prayer authored
Combine compressed_tex_sub_image, compressed_tex_sub_image_error and compressed_tex_sub_image_no_error into a single function. The added "enum tex_mode mode" parameter allows implementing the DSA / non-DSA variants and their error/no_error combinations. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
-
Bas Nieuwenhuizen authored
Took the liberty of enabling dfsm even though I don't have benchmark results yet, but it seems Raven-like. The rest is from radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
-
Marek Olšák authored
This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks. This solution is better, because the IR isn't dependent on wave32.
-