- 06 Jul, 2018 4 commits
-
-
Jon Turney authored
Per POSIX, limits.h may define PAGE_SIZE when the value is not indeterminate. v2: Just change the variable name, since there's no intended correlation here between this value and the machine's actual page size. Signed-off-by:
Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by:
Scott D Phillips <scott.d.phillips@intel.com>
-
Samuel Pitoiset authored
For merged shaders, VS as HS for example. Cc: <mesa-stable@lists.freedesktop.org> Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Ian Romanick authored
The bug fixed by the previous commit went undetected because extra stderr messages are not flagged by the CI. Copy the solution from fs_visitor::nir_emit_instr and mark the default case unreachable. An alternate solution is to delete the default case so that the compiler will issue a warning. That may require more work since there are other (impossible) cases that exist. Signed-off-by:
Ian Romanick <ian.d.romanick@intel.com> Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net>
-
Ian Romanick authored
Some of the lowering passes, nir_lower_locals_to_regs for example, can cause some previously live code to be dead. This pass in particular leaves a bunch of nir_instr_type_deref instructions floating around. This causes shader-db runs on Gen5 through Haswell to spew tons of messages like:

    VS instruction not yet implemented by NIR->vec4

UnrealEngine4/EffectsCaveDemo/239.shader_test is one shader that generates these messages. Cleaning up the dead code fixes that.

To verify, I did a shader-db before and after. Even though all the messages are gone, the results make my brain hurt. :(

Haswell
total cycles in shared programs: 411890163 -> 411891145 (<.01%)
cycles in affected programs: 57016 -> 57998 (1.72%)
helped: 3
HURT: 11
helped stats (abs) min: 2 max: 154 x̄: 96.67 x̃: 134
helped stats (rel) min: 0.08% max: 2.23% x̄: 1.42% x̃: 1.96%
HURT stats (abs) min: 18 max: 686 x̄: 115.64 x̃: 20
HURT stats (rel) min: 0.81% max: 7.12% x̄: 1.87% x̃: 0.93%
95% mean confidence interval for cycles value: -51.39 191.67
95% mean confidence interval for cycles %-change: -0.14% 2.46%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total cycles in shared programs: 259114802 -> 259115032 (<.01%)
cycles in affected programs: 24034 -> 24264 (0.96%)
helped: 1
HURT: 9
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08%
HURT stats (abs) min: 18 max: 48 x̄: 25.78 x̃: 20
HURT stats (rel) min: 0.80% max: 1.94% x̄: 1.08% x̃: 0.80%
95% mean confidence interval for cycles value: 12.42 33.58
95% mean confidence interval for cycles %-change: 0.54% 1.38%
Cycles are HURT.

Signed-off-by:
Ian Romanick <ian.d.romanick@intel.com> Fixes: 5a02ffb7 nir: Rework lower_locals_to_regs to use deref instructions Reviewed-by:
Jason Ekstrand <jason@jlekstrand.net>
-
- 05 Jul, 2018 36 commits
-
-
Emma Anholt authored
The GLES3 CTS makes a lot more progress on a run now.
-
Emma Anholt authored
-
Emma Anholt authored
From the ARB_color_buffer_float spec: 35. Should the clamping of fragment shader output gl_FragData[n] be controlled by the fragment color clamp. RESOLVED: Since the destination of the FragData is a color buffer, the fragment color clamp control should apply. Fixes arb_color_buffer_float-mrt mixed on v3d. Reviewed-by:
Rob Clark <robdclark@gmail.com>
-
Emma Anholt authored
Cleans up the CL of fbo-drawbuffers2-blend a bit. We could do better on more complicated cases by noticing if multiple RTs have the same blend state and emitting them in a single packet.
-
Emma Anholt authored
I had flagged it as enabled on V3D 4.x, but not actually implemented the per-RT enables. Fixes piglit fbo_drawbuffers2-blend.
-
Emma Anholt authored
Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one
-
Emma Anholt authored
We don't actually set the two flags together, but I want to use the r/g/b/a reordered fields in the next commit.
-
Emma Anholt authored
The varying packing would result in st_nir_assign_var_locations() picking new driver_locations, despite the pipe_stream_output already being set up for the old driver location. This left the gallium driver with no way to work back to what varying was referenced by pipe_stream_output. Fixes these tests on V3D: dEQP-GLES3.functional.transform_feedback.random.separate.points.3 dEQP-GLES3.functional.transform_feedback.random.separate.points.7 dEQP-GLES3.functional.transform_feedback.random.separate.points.9 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.3 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.8 Reviewed-by:
Timothy Arceri <tarceri@itsqueeze.com>
-
Jon Turney authored
Set with_dri from with_gallium when DRI GLX is explicitly configured, as well as when DRI GLX is chosen automatically. Signed-off-by:
Jon Turney <jon.turney@dronecode.org.uk> Reviewed-by:
Dylan Baker <dylan@pnwbakers.com>
-
Samuel Pitoiset authored
Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
If the given image doesn't enable CMASK, FMASK or DCC, it's useless to flush CB metadata. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
If the given image doesn't have HTILE, it's useless to flush DB metadata. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
GCC 4.8 fails to compile with "static const", while GCC 8.1 fails to compile with only "static". Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
-
Lionel Landwerlin authored
mesa/src/util/u_queue.c:242:15: error: address of array 'queue->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] Fixes: b238e33b "kutil/queue: add a process name into a thread name" Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
-
The MSVC preprocessor doesn't understand #warning. Fixes: 2e1e6511 ("util: extract get_process_name from xmlconfig.c") Reviewed-by:
Roland Scheidegger <sroland@vmware.com> Reviewed-by:
Emil Velikov <emil.velikov@collabora.com>
-
On Python 2, the default JSON separators are ', ' for items and ': ' for dicts. On Python 3, the default is the same when no indent is specified, but if one is (and we do specify one) then the default items separator becomes ',' (the dict separator remains unchanged). This change explicitly specifies the Python 3 default, which helps ensure that the output is identical, whether it was generated by Python 2 or 3. Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
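
A minimal sketch of the idea in the message above: passing `separators` explicitly so the serialized text does not depend on the interpreter's default. The helper name `dump_stable` and the sample data are hypothetical and are not taken from the Mesa build scripts.

```python
import json


def dump_stable(obj):
    """Serialize obj with an explicit item separator.

    With an indent, Python 2 would otherwise default to ', ' between items,
    leaving a trailing space at the end of each line, while Python 3
    defaults to ','. Spelling the separators out removes the difference.
    """
    return json.dumps(obj, indent=4, sort_keys=True, separators=(",", ": "))


text = dump_stable({"b": [1, 2], "a": 3})

# No line ends in whitespace, because the item separator is a bare comma.
assert all(line == line.rstrip() for line in text.splitlines())
```

`sort_keys=True` is an extra assumption added here so that key order is pinned as well; the commit message itself only discusses separators.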
-
In Python, dictionaries and sets are unordered, and as a result there is no guarantee that running this script twice will produce the same output. Using ordered dicts and explicitly sorting items makes the build more reproducible, and will make it possible to verify that we're not breaking anything when we move the build scripts to Python 3. Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
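
The approach described above can be sketched roughly as follows. The function `build_table` and its input shape (a dict mapping names to sets of strings) are assumptions made for illustration; the actual script and its data are not shown in this log.

```python
from collections import OrderedDict


def build_table(entries):
    """Return an OrderedDict with sorted keys and sorted values.

    Iterating over a plain dict or set may yield a different order from one
    run to the next, so both levels are sorted explicitly before anything
    is emitted.
    """
    table = OrderedDict()
    for name in sorted(entries):
        table[name] = sorted(entries[name])
    return table


table = build_table({"b": {"y", "x"}, "a": {"z"}})
for name, values in table.items():
    print(name, values)
```

Sorting at the point where the data is collected, rather than when it is printed, means every later consumer of the table sees the same order.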
-
Lionel Landwerlin authored
We already embed the headers, no need to redefine defines/structs. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
For gen8+, write out PPGTT tables in aub files so that full 48-bit addresses can be serialized. v2: Fix handling of `end` index in map_ppgtt v3: Correctly mark GGTT entry as present (Rafael) Signed-off-by:
Scott D Phillips <scott.d.phillips@intel.com> Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
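
As a rough illustration of why page tables are needed to cover 48-bit addresses, the sketch below splits an address into per-level table indices. It assumes a four-level layout with 4 KiB pages and 9 index bits per level; the function name `ppgtt_indices` is hypothetical and this is not the code from the commit.

```python
def ppgtt_indices(addr):
    """Split a 48-bit address into (pml4, pdp, pd, pt, page_offset).

    Assumed layout: four levels of 512-entry tables (9 bits each) on top
    of a 12-bit offset into a 4 KiB page, i.e. 4 * 9 + 12 = 48 bits.
    """
    if not 0 <= addr < (1 << 48):
        raise ValueError("address does not fit in 48 bits")
    levels = tuple((addr >> shift) & 0x1FF for shift in (39, 30, 21, 12))
    return levels + (addr & 0xFFF,)


print(ppgtt_indices(0xFFFFFFFFF000))
```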
-
Lionel Landwerlin authored
Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
Scott added new stuff in IGT. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
With PPGTT mappings, our aubinator implementation can be quite slow if we request a buffer that doesn't exist. Instead of doing a PPGTT walk for invalid addresses (0 lengths), wait until we're sure we want to decode the data. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
v2: by Lionel Fix memfd_create compilation issue Fix pml4 address stored on 32 instead of 64bits Return no buffer if first ppgtt page is not mapped v3: Drop additional memfd_create() (Rafael) Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
We use memfd to store physical pages as they get read/written to and the GGTT entries translating virtual address to physical pages. Based on a commit by Scott Phillips. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
This is a simple, invasive, liberally licensed red-black tree implementation. It's an invasive data structure similar to the Linux kernel linked-list where the intention is that you embed a rb_node struct in the data structure you intend to put into the tree.

The implementation is mostly based on the one in "Introduction to Algorithms", third edition, by Cormen, Leiserson, Rivest, and Stein. There were a few other key design points:

* It's an invasive data structure similar to the [Linux kernel linked list].
* It uses NULL for leaves instead of a sentinel. This means a few algorithms differ a small bit from the ones in "Introduction to Algorithms".
* All search operations are inlined so that the compiler can optimize away the function pointer call.

Reviewed-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com>
-
Lionel Landwerlin authored
Now that we're softpinning the address of our BOs in anv & i965, the addresses selected start at the top of the addressing space. This is a problem for the current implementation of aubinator, which uses only a 40-bit mmapped address space. This change keeps track of all the memory writes from the aub file and fetches them on request by the batch decoder. As a result we can get rid of the 1<<40 mmapped address space and only rely on the mmapped aub file \o/ Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
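
One way to picture "keep track of the writes and fetch them on request" is a sorted log of address ranges with a lookup by address. The sketch below is only an illustration of that idea: the class `WriteLog`, its methods, and the assumption that recorded writes never overlap are all hypothetical and do not describe aubinator's actual data structures.

```python
import bisect


class WriteLog:
    """Record memory writes and find the one covering a given address."""

    def __init__(self):
        self._starts = []  # start addresses, kept sorted
        self._data = {}    # start address -> bytes written there

    def record(self, addr, data):
        # Assumption: writes do not overlap; rewriting the same start
        # address simply replaces the earlier contents.
        if addr not in self._data:
            bisect.insort(self._starts, addr)
        self._data[addr] = bytes(data)

    def lookup(self, addr):
        """Return (start, data) for the write containing addr, or None."""
        i = bisect.bisect_right(self._starts, addr) - 1
        if i < 0:
            return None
        start = self._starts[i]
        data = self._data[start]
        if addr < start + len(data):
            return start, data
        return None


log = WriteLog()
log.record(0xFFFFFFFFF000, b"\x01" * 16)
print(log.lookup(0xFFFFFFFFF008) is not None)
```

Because only the ranges that were written are stored, an address near the top of a 48-bit space costs no more than one near the bottom, which is the property the commit message is after.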
-
Lionel Landwerlin authored
Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
In a follow-up commit in this series, we stop copying the data from the mmap'ed file into our big gtt mmap, and start referencing data in it directly. So reallocating the read buffer and adding more data from stdin wouldn't work. For that reason, let's stop supporting processing from stdin. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Lionel Landwerlin authored
These memory offsets are stored in the gen_batch_decode_ctx. Signed-off-by:
Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by:
Rafael Antognolli <rafael.antognolli@intel.com>
-
Commit f69bc797 did the following:

    - if format.layout in ('bptc', 'astc'):
    + if format.layout in ('astc'):

The intention was to go from matching either 'bptc' or 'astc' to matching only 'astc'. But the new code doesn't respect this intention any more, because in Python `('astc')` is not a tuple containing a string, it is just the string (the parentheses are simply ignored). That means we now match any substring of 'astc', for example 'a'. This commit fixes the test to respect the original intention. Fixes: f69bc797 "gallium/auxiliary: Add helper support for bptc format compress/decompress" Reviewed-by:
Eric Engestrom <eric.engestrom@intel.com>
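
The pitfall described above is easy to reproduce in isolation. In the sketch below the two helper functions are hypothetical stand-ins for the `format.layout` check, and the one-element tuple is only one possible way to write the corrected test; the log does not show which form the fix actually used.

```python
def matches_buggy(layout):
    # ('astc') is just the string 'astc', so this is a substring test.
    return layout in ('astc')


def matches_fixed(layout):
    # The trailing comma makes a one-element tuple, so this is a
    # membership test against exactly one value.
    return layout in ('astc',)


for layout in ('astc', 'a', 'st', 'bptc'):
    print(layout, matches_buggy(layout), matches_fixed(layout))
```

Comparing with `layout == 'astc'` would behave the same as the tuple form when there is only one value to match.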
-
Samuel Pitoiset authored
Always emitting a bottom-of-pipe event is quite dumb. Instead, start to optimize these functions by syncing PFP for the top-of-pipe and syncing ME for the post-index-fetch event. This can still be improved by emitting EOS events for syncing PS and CS stages. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-
Samuel Pitoiset authored
This introduces radv_barrier() (same as the draw/dispatch codepath). This helper is used for merging the code from CmdWaitEvents() and CmdPipelineBarrier because it's quite similar. We do ignore the source stage mask for CmdWaitEvents because it's irrelevant when event objects are used. Signed-off-by:
Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by:
Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
-