Commits · lima-opt-indexed-draws_v2 · Andreas Baierl / mesa

Apr 28, 2020

WIP: lima: Split indexed draws for GL_ARRAY_BUFFER · 254fe84f

Andreas Baierl authored 4 years ago


It is possible to upload the index buffer with glBufferData and bind
it to GL_ARRAY_BUFFER. The data needs to be accessed with
PIPE_BIND_VERTEX_BUFFER then. Add the possibility to also split these
draws.

We don't create a shadow memory for that case in order to avoid creating
this shadow buffer every time we map a vertex buffer resource.
Instead, we map the GPU memory while preparing the draw, directly before
calculating the split draw data. This seems to be the solution
with the least overhead.

An example of this case is dEQP-GLES2.functional.buffer.write.use.index_array.array

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

254fe84f

WIP: lima: Use caching with splitted index buffer draws · a694c56c

Andreas Baierl authored 4 years ago


Use panfrost's index_cache mechanism to store the calculated min/max values
for the split draws.

Also do some small cleanups in the code.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

a694c56c

WIP: lima: Call lima_get_minmax with pipe_draw_info · 0b13783f

Andreas Baierl authored 4 years ago


No need to pass the single values

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

0b13783f

WIP: lima: Move minmax struct from context to resource · c29dd0c2

Andreas Baierl authored 4 years ago


The min/max indices belong to the resource, not to the context,
so move them.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

c29dd0c2

WIP: lima: Split indexed draws · 55d7ede6

Andreas Baierl authored 4 years ago


The blob splits indexed draws when the indices in the index buffer are not
contiguous. It does so as soon as there is a gap > 8 between used indices.
Example:
index_buffer = [0, 10, 2] -> results in a single draw (0~10)
index_buffer = [0, 11, 2] -> results in two draws (0~2), (11)

So it seems that the blob splits draws once the gap reaches 11 - 2 = 9.

This effort tries to implement this behaviour.

What it mainly does:
1) Allocate shadow CPU memory when a buffer for indices is created.
    This is currently only done if the buffer is bound with
    PIPE_BIND_INDEX_BUFFER.
    There is a dEQP test (functional.buffer.write.use.index_array.array)
    which uses PIPE_BIND_VERTEX_BUFFER to upload the indices. This case isn't
    handled in this MR, although the blob splits those draws too.
2) Copy the dirty part of the shadow buffer back to GPU memory during unmapping.
3) Within the indexed draw we do some more or less magic things :)
    We memcpy the shadow memory to a temporary buffer, quicksort it
    and count how many draws we need when splitting wherever the gap
    between used indices is > 8.
    This can still be improved: we can use a better algorithm to find the gaps,
    and we should probably think about how to avoid the memcpy and qsort on each draw.
4) Fill an array of structs (ctx->minmax) with the min/max values
    of each calculated draw, for use in VS CMD stream creation.
5) Use correct addresses for varyings and attributes, including the right offset.
6) Use the ctx->minmax array and create VS CMDs while iterating over the array.
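
The gap search in steps 3 and 4 can be sketched roughly as follows. This is only an illustration of the idea, not the code from this commit: `struct draw_range`, `SPLIT_GAP` and `split_index_ranges` are made-up names, and the indices are assumed to be 32-bit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names: none of these are taken from the lima driver. */
struct draw_range {
    uint32_t min;
    uint32_t max;
};

#define SPLIT_GAP 8 /* split when two neighbouring used indices differ by more */

static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Sorts a private copy of the indices and writes one min/max pair per draw.
 * Returns the number of ranges written, or 0 for empty input or on
 * allocation failure. If out_cap is too small, the remaining indices are
 * merged into the last range. */
static size_t split_index_ranges(const uint32_t *indices, size_t count,
                                 struct draw_range *out, size_t out_cap)
{
    if (count == 0 || out_cap == 0)
        return 0;
    assert(indices != NULL && out != NULL);

    uint32_t *sorted = malloc(count * sizeof(*sorted));
    if (!sorted)
        return 0;
    memcpy(sorted, indices, count * sizeof(*sorted));
    qsort(sorted, count, sizeof(*sorted), cmp_u32);

    size_t n = 0;
    out[0].min = out[0].max = sorted[0];
    for (size_t i = 1; i < count; i++) {
        /* A gap larger than SPLIT_GAP starts a new draw range. */
        if (sorted[i] - sorted[i - 1] > SPLIT_GAP && n + 1 < out_cap) {
            n++;
            out[n].min = sorted[i];
        }
        out[n].max = sorted[i];
    }

    free(sorted);
    return n + 1;
}
```

With this sketch, [0, 10, 2] yields the single range 0..10, while [0, 11, 2] yields the two ranges 0..2 and 11..11.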

The main open issue is that we need benchmarks to show whether this really is
an optimization and, if so, what the best gap value is.

We have to get those benchmarks before dropping the WIP.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

55d7ede6

WIP: panfrost: lima: Add void pointer to index cache · 75f900c1

Andreas Baierl authored 4 years ago

Prepare panfrost's index cache to take a pointer.
We need this for the follow-up lima commits, which can split an indexed draw
based on the index buffer. The information we need for the splitting will be
saved in an array. The index cache will keep a pointer to that array.

If _get and _add are called with a NULL pointer, the behaviour is unchanged.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

75f900c1

Apr 22, 2020

lima: Reorder plbu draw cmds according to the blob · c933802e

Andreas Baierl authored 4 years ago


Though this has neither positive nor negative effects,
reorder the cmds to be blob-equivalent.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>

c933802e

iris: fail screen creation when kernel support is not there · f402b7c5

Lionel Landwerlin authored 4 years ago


v2: Bump check to I915_PARAM_HAS_CONTEXT_ISOLATION (v4.16) (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: mesa/mesa#2803


Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <mesa/mesa!4643>

f402b7c5

gitlab-ci: add a list of excluded tests for RADV · bca97abf

Samuel Pitoiset authored 4 years ago


Exclude WSI related tests in CI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <!4656>

bca97abf

meta,i965: Rip GL_EXT_texture_multisample_blit_scaled support out of meta · f1a12d68

Faith Ekstrand authored 4 years ago


i965 is the only driver that ever linked to this code and it's been
doing it in BLORP for a long time now.  The only possible case where it
would have fallen back to meta was for depth/stencil but that should
have ended starting with 6cec618e.  Rip out the dead code.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <mesa/mesa!4622>

f1a12d68

panfrost: Assert on unimplemented fragcoord etc · c6244f93

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

c6244f93

panfrost: Fix crashes with small BOs · 133c1aba

Alyssa Rosenzweig authored 4 years ago


Affects Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

133c1aba

pan/bi: Assert out multiple textures · 5c695210

Alyssa Rosenzweig authored 4 years ago


Only for a moment.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

5c695210

pan/bi: Pack TEX compact instructions · 3551c138

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

3551c138

pan/bi: Generate TEX_COMPACT instruction · cd5fe3b9

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

cd5fe3b9

pan/bi: Stub out tex_compact logic · 0769036a

Alyssa Rosenzweig authored 4 years ago


We may generate either texture type.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

0769036a

pan/bi: Add normal/compact/dual switch to IR · f85746af

Alyssa Rosenzweig authored 4 years ago


For tex.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

f85746af

pan/bi: Feed data register to BI_TEX · 93be49b1

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

93be49b1

pan/bi: Include TEX_COMPACT f16 opcode · 76d1bb03

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

76d1bb03

pan/bi: Structify TEX compact · bfc06b10

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

bfc06b10

pan/bi: Disassemble f16 dual tex · cf7b9523

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

cf7b9523

pan/bi: Document when dual-tex is triggered · a2c73535

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

a2c73535

pan/bi: Print tex_compact coordinates · 6fe41a12

Alyssa Rosenzweig authored 4 years ago


Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4671>

6fe41a12

intel/compiler: Put back saturate on [iu]add_sat opcodes · 902c8731

Kenneth Graunke authored 4 years ago


I deleted one too many inst->saturate = ... lines.  This one must stay.

Fixes: b7c47c4f ("intel/compiler: Drop nir_lower_to_source_mods() and related handling.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <mesa/mesa!4669>

902c8731

panfrost: Align Android makefiles with recent changes · f699bb42

Roman Stratiienko authored 4 years ago


Signed-off-by: Roman Stratiienko <roman.stratiienko@nure.ua>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <!4634>

f699bb42

Apr 21, 2020

freedreno/ir3: Drop handling FRAG_RESULT_DEPTH writing to .z · 2f4a3c1c

Emma Anholt authored 4 years ago

Since we consume NIR, we get FRAG_RESULT_DEPTH in .x.  Something must have
been working out for this code to not be trying to get an undefined value,
but go ahead and drop it now.

Part-of: <mesa/mesa!4668>

2f4a3c1c

turnip: fix GMEM resolve in CmdNextSubpass · eab73799

Jonathan Marek authored 4 years ago


The BLIT scissor must be set correctly for tu_store_gmem_attachment.

Fixes this deqp test:

dEQP-VK.pipeline.multisample_shader_builtin.sample_id.137_191_1.samples

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <mesa/mesa!4666>

eab73799

gitlab-ci: adapt query_traces_yaml to gitlab specific changes · e4521aea

Andres Gomez authored 4 years ago


This change was missing after acf7e73b ("gitlab-ci: make explicit
tracie is gitlab specific").

Fixes: acf7e73b ("gitlab-ci: make explicit tracie is gitlab specific")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <mesa/mesa!4638>

e4521aea

egl: simplify client/platform extension handling · 0a884d73

Emil Velikov authored 4 years ago


For GLVND reasons the client/platform extension strings should be
split, while in the non-GLVND case they're one big string.

Currently we handle this distinction at run-time for no obvious reason,
adding extra code and complexity.

Replace that with a few well-placed #if USE_LIBGLVND guards.

As a side effect this removes a minor memory leak due to the
concatenation in the non-GLVND case.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Part-of: <mesa/mesa!4491>

0a884d73

mesa/gallium: do not use enum for bit-allocated member · 013d9e40

Erik Faye-Lund authored 4 years ago


The signedness of enums is implementation-defined, so on platforms with
signed enums, this isn't going to work. One such platform is Microsoft Windows.

So let's just use an unsigned here instead.
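
A minimal sketch of the problem, assuming a 3-bit field. The struct and field names are made up for illustration, and the signed case is modelled with an explicit `signed int` bit-field because whether an enum bit-field is signed depends on the compiler:

```c
#include <assert.h>

/* Hypothetical structs; the field names are illustrative, not mesa's. */
struct flags_signed {
    signed int func : 3;  /* models a 3-bit field whose enum type is signed */
};

struct flags_unsigned {
    unsigned func : 3;    /* the fix: all 3 bits hold the magnitude */
};

/* Stores a value in the unsigned 3-bit field and reads it back. */
static unsigned roundtrip_unsigned(unsigned value)
{
    struct flags_unsigned f;
    f.func = value;
    return f.func;
}

/* Same with the signed field, which can only represent -4..3. */
static int roundtrip_signed(int value)
{
    struct flags_signed f;
    f.func = value;  /* values 4..7 do not fit and come back changed */
    return f.func;
}
```

Values 0..7 survive `roundtrip_unsigned`, whereas `roundtrip_signed` only preserves 0..3. What larger values turn into is implementation-defined, typically a negative number.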

Fixes: b1c4c4c7 ("mesa/gallium: automatically lower alpha-testing")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!4648>

013d9e40

util/ralloc: fix ralloc alignment on Win64 · a842dc15

Jesse Natalie authored 5 years ago


Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <!4648>

a842dc15

intel/compiler: Drop nir_lower_to_source_mods() and related handling. · b7c47c4f

Kenneth Graunke authored 4 years ago


I think we're unanimous in wanting to drop nir_lower_to_source_mods.
It's a bit of complexity to handle in the backend, but perhaps more
importantly, would be even more complexity to handle in nir_search.

And, it turns out that since we made other compiler improvements in the
last few years, source modifiers no longer appear to buy us anything of value.
Summarizing the results from shader-db from this patch:

 - Icelake (scalar mode)

   Instruction counts:

   - 411 helped, 598 hurt (out of 139,470 shaders)
   - 99.2% of shaders remain unaffected.  The average increase in
     instruction count in hurt programs is 1.78 instructions.
   - total instructions in shared programs: 17214951 -> 17215206 (<.01%)
   - instructions in affected programs: 1143879 -> 1144134 (0.02%)

   Cycles:

   - 1042 helped, 1357 hurt
   - total cycles in shared programs: 365613294 -> 365882263 (0.07%)
   - cycles in affected programs: 138155497 -> 138424466 (0.19%)

 - Haswell (both scalar and vector modes)

   Instruction counts:

   - 73 helped, 1680 hurt (out of 139,470 shaders)
   - 98.7% of shaders remain unaffected.  The average increase in
     instruction count in hurt programs is 1.9 instructions.
   - total instructions in shared programs: 14199527 -> 14202262 (0.02%)
   - instructions in affected programs: 446499 -> 449234 (0.61%)

   Cycles:

   - 5253 helped, 5559 hurt
   - total cycles in shared programs: 359996545 -> 360038731 (0.01%)
   - cycles in affected programs: 155897127 -> 155939313 (0.03%)

Given that ~99% of shader-db remains unaffected, and the affected
programs are hurt by about 1-2 instructions - which are all cheap
ALU instructions - this is unlikely to be measurable in terms of
any real performance impact that would affect users.

So, drop them and simplify the backend, and hopefully enable other
future simplifications in NIR.

Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <mesa/mesa!4616>

b7c47c4f

meson: update llvm dependency logic for meson 0.54.0 · fdd0ce12

Dylan Baker authored 4 years ago


In meson 0.54.0 I fixed the llvm cmake dependency to return "not found"
if shared linking is requested. This means that for 0.54.0 and later we
don't need to do anything, and for earlier versions we only need to
change the logic to force the config-tool method if shared linking is
required.

Fixes: 821cf694
       ("meson: Use cmake to find LLVM when building for window")

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <mesa/mesa!4556>

fdd0ce12

remove final imports.h and imports.c bits · 8e369613

Dylan Baker authored 6 years ago


This moves the fi_types to a new mesa_private.h and removes the
imports.c file. The vast majority of this patch is just removing
pound includes of imports.h and fixing up the recursive includes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <!3024>

8e369613

dri/nouveau: replace assert with unreachable · 289f02d1

Dylan Baker authored 5 years ago


I don't know why removing imports.h suddenly makes clang realize that
this function can not return in a non-debug build, but it does.
Unreachable is better because it doesn't have this problem.
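
A minimal sketch of why this matters, assuming a GCC- or Clang-style compiler. `UNREACHABLE`, `enum fmt` and `bytes_per_pixel` are hypothetical names used only for illustration and are not taken from the nouveau code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an unreachable() helper. Unlike a bare
 * assert(), it still tells the compiler that control never continues
 * when NDEBUG compiles the assert away. */
#if defined(__GNUC__) || defined(__clang__)
#define UNREACHABLE(msg) do { assert(!msg); __builtin_unreachable(); } while (0)
#else
#define UNREACHABLE(msg) do { assert(!msg); abort(); } while (0)
#endif

enum fmt { FMT_A, FMT_B };

static int bytes_per_pixel(enum fmt f)
{
    switch (f) {
    case FMT_A: return 2;
    case FMT_B: return 4;
    }
    /* With only assert(0) here, a non-debug build would fall off the end
     * of a non-void function without returning a value. */
    UNREACHABLE("unknown format");
}
```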

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <mesa/mesa!3024>

289f02d1

mesa: move ADD_POINTERS to macros.h · c3db0936

Dylan Baker authored 6 years ago


I'm not really sure where else to put it. Since imports.h only has two
things left in it (neither of which are abstractions for smoothing away
libc differences) I'd like to get them out of there. macros.h is the
only place I can think of to put this macro.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <mesa/mesa!3024>

c3db0936

mesa|mapi: replace _mesa_[v]snprintf with [v]snprintf · bf188f34

Dylan Baker authored 5 years ago


MSVC 2015 and newer has perfectly valid snprintf and vsnprintf
implementations, let's just use those.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <mesa/mesa!3024>

bf188f34

replace imports memory functions with utils memory functions · c495c3af

Dylan Baker authored 6 years ago


Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <!3024>

c495c3af

util: Add an aligned realloc function · bb560f2d

Dylan Baker authored 6 years ago


Mesa has one of these in imports.h, so u_memory needs one as well. This
is a port of the version from mesa.
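
For illustration only, an aligned realloc can be sketched as allocate, copy, free. This is not the mesa implementation: the `sketch_` names are made up, and the code assumes a libc that provides C11 `aligned_alloc` (MSVC, for instance, does not):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. C11 aligned_alloc wants the size to be a multiple
 * of the alignment, so the size is rounded up first. */
static void *sketch_align_malloc(size_t size, size_t alignment)
{
    /* alignment must be a power of two */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    size_t rounded = (size + alignment - 1) & ~(alignment - 1);
    return aligned_alloc(alignment, rounded ? rounded : alignment);
}

/* Allocates a new aligned block, copies the smaller of the old and new
 * sizes, then frees the old block. `old` must be NULL or come from
 * sketch_align_malloc. */
static void *sketch_align_realloc(void *old, size_t old_size,
                                  size_t new_size, size_t alignment)
{
    void *buf = sketch_align_malloc(new_size, alignment);
    if (!buf)
        return NULL; /* the old block is left untouched on failure */
    if (old) {
        memcpy(buf, old, old_size < new_size ? old_size : new_size);
        free(old);
    }
    return buf;
}
```

A plain realloc() cannot be used here because it gives no guarantee that the alignment of the original block is preserved.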

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <!3024>

bb560f2d

replace malloc macros in imports.h with u_memory.h versions · b8577590

Dylan Baker authored 6 years ago


Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <!3024>

b8577590
