- Jan 13, 2020
-
-
Tomeu Vizoso authored
-
Tomeu Vizoso authored
-
Tomeu Vizoso authored
It pollutes the output of programs that use Panfrost and can confuse its callers, such as test runners. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
To avoid hitting the assert in the default case, add a nop for this intrinsic. dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.3 Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
-
Tomeu Vizoso authored
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
-
Tomeu Vizoso authored
We are able to run only 1/5th of the tests in around the same time that dEQP-GLES2 takes, so do that for now while more DUTs are installed. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Tomeu Vizoso authored
Use the normal build job to also prepare the artifacts for LAVA jobs. For that, the build container needs to also build the test suites, kernel, ramdisk, etc. Then the build job will place the just-built Mesa in the ramdisk and the test job can generate a LAVA job and point to those artifacts. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
-
Fix different use cases for transform feedback by setting PIPE_CAP_PACKED_STREAM_OUTPUT=0, PIPE_CAP_VIEWPORT_TRANSFORM_LOWERED=1 and PIPE_CAP_PSIZ_CLAMPED=1. This is enough for all dEQP xfb-related test cases to run successfully. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
-
This new capability indicates that the point size has been clamped. This also means that the gl_PointSize has been modified and that its value should be lowered for transform feedback, if needed. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
-
This new capability indicates that the nir_lower_viewport_transform pass is enabled. This also means that the gl_Position value is modified and should be lowered for transform feedback, if needed. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
-
Setting this cap to 0 (default is 1) should disable packing optimization for stream output (e.g. GL transform feedback captured variables). Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
-
Some lowering passes modify the value of built-in variables in order for drivers to work properly. However, modifying such values will also break transform feedback as the captured value won't match what's expected. For example, on some hardware, the vertex shaders are expected to output gl_Position in screen space. However, the transform feedback captured value is still supposed to be the world-space coordinates (see nir_lower_viewport_transform). To fix that, we create a new variable that contains the pre-transformation value and use it for transform feedback instead of the built-in one. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
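The approach described above can be modeled in a few lines. This is a minimal Python sketch, not Mesa code: `viewport_transform`, `run_vertex_shader` and `VertexOutputs` are hypothetical names, and the viewport math (perspective divide followed by scale and offset) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VertexOutputs:
    position: tuple      # what the rasterizer consumes (transformed by the lowering)
    xfb_position: tuple  # what transform feedback captures (pre-transformation copy)


def viewport_transform(pos, width, height):
    """Assumed lowering: perspective divide, then map NDC to the viewport."""
    x, y, z, w = pos
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    return (
        (ndc_x * 0.5 + 0.5) * width,
        (ndc_y * 0.5 + 0.5) * height,
        ndc_z * 0.5 + 0.5,
        1.0 / w,
    )


def run_vertex_shader(pos, width, height):
    """Save the untouched value in a separate output before lowering the built-in."""
    saved = pos
    return VertexOutputs(
        position=viewport_transform(pos, width, height),
        xfb_position=saved,
    )
```

With this split, the transformed built-in and the captured value no longer share storage, so the lowering cannot change what transform feedback reads back.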
-
When varying packing is disabled for transform feedback and an xfb declaration points to an array element or structure member, the element/member should be aligned to the start of a slot as well. If that's not the case, a new varying is created and the element/member value is copied. There might be a way to further optimize the number of slots allocated or the number of copies necessary if the performance cost is problematic, for example in cases where simply padding the top-level variable would correctly align all the captured values. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
-
Some drivers (e.g. Panfrost) don't support packing of varyings when used for transform feedback. This new constant ensures that any varying used for xfb is aligned at the start of a slot and won't be packed with other varyings. Scenarios where transform feedback declarations are related to an array element or a struct member will be handled in a subsequent patch. Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
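A rough model of the slot assignment this implies, as a Python sketch rather than the actual linker code. It assumes four components per slot and varyings of at most four components; `assign_locations` and its `pack_xfb` flag (standing in for the constant described above) are hypothetical names.

```python
COMPONENTS_PER_SLOT = 4  # assumption: one slot holds a vec4


def assign_locations(varyings, pack_xfb):
    """Assign (slot, first_component) to each varying.

    varyings: list of (name, num_components, is_xfb) with num_components <= 4.
    pack_xfb: when False, xfb varyings start a slot and share it with nothing.
    """
    locations = {}
    cursor = 0  # position counted in components
    for name, num_components, is_xfb in varyings:
        isolate = is_xfb and not pack_xfb
        offset = cursor % COMPONENTS_PER_SLOT
        # Move to the next slot if the varying must be aligned or does not fit.
        if offset and (isolate or offset + num_components > COMPONENTS_PER_SLOT):
            cursor += COMPONENTS_PER_SLOT - offset
        locations[name] = divmod(cursor, COMPONENTS_PER_SLOT)
        cursor += num_components
        if isolate:
            # Pad out the rest of the slot so nothing is packed behind it.
            cursor += -cursor % COMPONENTS_PER_SLOT
    return locations
```

For three varyings `a` (2 components), `b` (1 component, captured) and `c` (1 component), packing places all of them in slot 0, while disabling packing for xfb gives `b` a slot of its own and pushes `c` to the next one.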
-
The per-batch headers/gpu_headers dynarrays need to be freed during the batch cleanup to prevent leaking. Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <!3308> Part-of: <!3308>
-
The bo access needs to be freed prior to removing it from its hash table. This prevents leaking them over time. Signed-off-by: Daniel Ogorchock <daniel.ogorchock@garmin.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <mesa/mesa!3308>
-
Samuel Pitoiset authored
This field is for the primitive ID export to the fragment shader. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
Only needed for NGG without passthrough mode or for NGG streamout. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
It can't be enabled for geometry shaders, for NGG streamout and for vertex shaders that export the primitive ID. NGG passthrough requires that LDS isn't used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
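The restrictions listed above amount to a simple predicate. The following Python sketch only restates them; `ShaderInfo` and `can_use_ngg_passthrough` are hypothetical names and do not correspond to RADV structures.

```python
from dataclasses import dataclass


@dataclass
class ShaderInfo:
    is_geometry_shader: bool = False
    uses_ngg_streamout: bool = False
    exports_primitive_id: bool = False


def can_use_ngg_passthrough(info):
    """Passthrough is ruled out by any of the three conditions from the commit message."""
    return not (
        info.is_geometry_shader
        or info.uses_ngg_streamout
        or info.exports_primitive_id
    )
```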
-
Samuel Pitoiset authored
RadeonSI and AMDVLK do that. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
- Jan 12, 2020
-
-
Ilia Mirkin authored
Per the semi-recently-released NVIDIA docs, when this bit is not enabled, then the result for RT[0] will be used. So if e.g. only a single RT is drawn to and it's not RT[2], the results will not be visible. Fixes GTF-GL45.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline which was failing due to a frag shader outputting only to location=2. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
-
Ilia Mirkin authored
This corresponds to gl_PrimitiveID and gl_Layer. When both of these are stored in a single AST.64 or AST.128 operation, then it appears as though the whole store fails. Fixes the recently extended glsl-1.50-transform-feedback-builtins piglit, and also gtf30.GL3Tests.transform_feedback.transform_feedback_builtins. The issue was reproduced on GM107 and GP108 but not GK208 nor GK104. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
-
Ilia Mirkin authored
Perhaps in a future implementation, such events could be passed back to the driver, or queried directly. However for now, this is required for GL 4.3 robustness contexts. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
-
Ilia Mirkin authored
The fix was found by Karol Herbst a long time ago, but it was unclear why it helped or if it would create additional problems. This change adds a comment that explains what's going on, and in the process also normalizes the nv50 implementation to match. The coordinates which are fed to gl_Position map directly to pixel coordinates, since the viewport transform is disabled. If the framebuffer is MSAA, then that doesn't affect the pixel coordinates at all, it's just that each pixel has multiple samples. Note that this makes it really clear that this approach is inappropriate for EXT_framebuffer_multisample_blit_scaled, and also the 3d path will fail terribly for direct copies. Thankfully the 2d path normally takes care of this. Fixes KHR-GL43.packed_depth_stencil.blit.depth32f_stencil8 as well as scaling issues in a number of EXT_framebuffer_multisample-related piglit tests (although they continue to fail due to inaccuracies). Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
-
Bas Nieuwenhuizen authored
This updates for the new metadata ABI in radeonsi. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Marge Bot <mesa/mesa!3244> Part-of: <mesa/mesa!3244>
-
Vasily Khoruzhick authored
lima doesn't support alpha test, flat shading, two-sided color or clip planes. We can enable these caps when the corresponding hw features are implemented in the driver. Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Vasily Khoruzhick authored
Fixes some of dEQP-GLES2.functional.polygon_offset.* tests and shadows in Q3A. Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Vasily Khoruzhick authored
Apparently Mali4x0 doesn't do viewport clipping, so anything drawn beyond the viewport is still rendered. Looks like we need to use scissors to do the clipping. Fixes most of dEQP-GLES2.functional.clipping.*; 6 of the 7 remaining failures also fail on the blob. The remaining one [1] fails on many other gallium drivers. [1] dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
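Using scissors for clipping comes down to intersecting the scissor rectangle with the viewport. Below is a minimal Python sketch of that intersection, not the lima implementation; `clamp_scissor_to_viewport` and the `(minx, miny, maxx, maxy)` rectangle layout are assumptions made for illustration.

```python
def clamp_scissor_to_viewport(scissor, viewport):
    """Intersect two rectangles given as (minx, miny, maxx, maxy).

    If they do not overlap, the result has zero area, so nothing is drawn.
    """
    minx = max(scissor[0], viewport[0])
    miny = max(scissor[1], viewport[1])
    # Never let the maximum fall below the minimum, so the rect stays valid.
    maxx = max(minx, min(scissor[2], viewport[2]))
    maxy = max(miny, min(scissor[3], viewport[3]))
    return (minx, miny, maxx, maxy)
```

When the application has not enabled a scissor, the same function can be fed the full framebuffer rectangle, which leaves the viewport itself as the effective clip region.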
-
Vasily Khoruzhick authored
Apparently it doesn't depend on the primitive type; the value only depends on whether we specify the point size via a PLBU command. Bit 12 is set in this case. Reviewed-by: Qiang Yu <yuq825@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de> Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
-
Timothy Arceri authored
The state value of main_uniform_storage_index will be wrong for add_parameter() when find_and_update_previous_uniform_storage() finds a uniform if there is more than 1 uniform used in multiple shader stages. The new code is also simpler. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
-
- Jan 11, 2020
-
-
Christian Gmeiner authored
This new debug option will fake some driver CAPs to be able to run dEQP for GLES3. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Jonathan Marek <jonathan@marek.ca> Tested-by: Marge Bot <mesa/mesa!3351> Part-of: <mesa/mesa!3351>
-
Timur Kristóf authored
The output of v_cmp instructions is s1 (a single SGPR) in wave32 mode, as opposed to s2 (an SGPR-pair) in wave64 mode. A couple of cases where this should have been fixed were omitted from the previous patch by mistake. Fixes: e0bcefc3 Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
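The size rule stated above can be written as a one-line helper. This Python sketch is illustrative only: `lane_mask_sgprs` is a hypothetical name, not an ACO function, and it assumes 32-bit SGPRs with one mask bit per lane.

```python
def lane_mask_sgprs(wave_size):
    """Number of 32-bit SGPRs needed for a v_cmp result: one bit per lane."""
    if wave_size not in (32, 64):
        raise ValueError("wave size must be 32 or 64")
    return wave_size // 32  # 1 (s1) in wave32, 2 (s2) in wave64
```

Deriving the register class from the wave size in one place, instead of hard-coding s2, is the kind of change that avoids the omissions this commit fixes.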
-
- Jan 10, 2020
-
-
Alyssa Rosenzweig authored
...in case we have arrays in a UBO block that we'd like to access indirectly. Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Marge Bot <mesa/mesa!3352> Part-of: <mesa/mesa!3352>
-
Francisco Jerez authored
This will be convenient in a later commit enabling SIMD32 fragment shaders, and happens to fix the calculation for MATH instructions which is currently inaccurate for SIMD-lowered instructions on Gen4-5 platforms (all of them on Gen4 in SIMD16 mode), since it was based on the shader's dispatch width rather than on the actual execution size of the instruction. This causes some shader-db noise on Gen4 due to the more compact register allocation interacting with the SEND dependency workarounds, but otherwise no major changes. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-
Francisco Jerez authored
The liveness calculation done by the local CSE pass in order to prune AEB entries whose sources are no longer live is currently inaccurate, because the live intervals are calculated once at the beginning of the pass, so they don't take into account any of the copy instructions inserted by the CSE pass as it makes progress. However, the IP counter used in that calculation is based on the start_ip of the basic block, which is updated automatically whenever any instructions are inserted into the CFG. This causes the IP counter and liveness intervals to get out of sync in programs with multiple basic blocks, causing the CSE pass to toss AEB entries prematurely, which can lead to missed optimization opportunities rather non-deterministically. On BDW this leads to the following shader-db changes:
total instructions in shared programs: 14952488 -> 14951763 (-0.00%)
instructions in affected programs: 45416 -> 44691 (-1.60%)
helped: 40
HURT: 4
total spills in shared programs: 20989 -> 20970 (-0.09%)
spills in affected programs: 103 -> 84 (-18.45%)
helped: 3
HURT: 0
total fills in shared programs: 24981 -> 24926 (-0.22%)
fills in affected programs: 127 -> 72 (-43.31%)
helped: 3
HURT: 0
In addition it avoids a number of regressions in combination with some of the optimization changes I'm working on for SIMD32, which would have made CSE more effective in some places while, astonishingly, causing it to be less effective elsewhere in the program. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
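The percentages in the shader-db figures above are relative changes from the old count to the new one, rounded to two decimals. A small Python sketch that reproduces them follows; `percent_change` and `format_change` are hypothetical helpers, not part of shader-db.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


def format_change(before, after):
    """Render a count pair in the 'old -> new (pct%)' style used above."""
    return f"{before} -> {after} ({percent_change(before, after):.2f}%)"
```

For instance, `format_change(45416, 44691)` gives `45416 -> 44691 (-1.60%)`, and the total instruction count rounds to `-0.00%` because 725 fewer instructions out of roughly 15 million is under 0.005%.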
-
Francisco Jerez authored
For uniform sample ID, only the first channel of msg_data will be initialized. We need to pass only that component to the SEND message for SIMD lowering to unzip the descriptor source correctly. Fixes several dozen conformance test failures with SIMD32 fragment shaders enabled, including: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.dynamic_sample_number.* Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
-