src/gallium/auxiliary/util/u_vbuf.c · 80aca96803a37a7436ff96c0cec4a2643f11ed05 · Mesa / mesa

gallium/auxiliary: Reduce conversions in u_vbuf_get_minmax_index_mapped · 80aca968

Icecream95 authored Dec 11, 2019



With this patch, GCC generates vectorized code that does the comparisons
without converting the indices to 32-bit first.

This optimization makes u_vbuf_get_minmax_index_mapped almost twice as
fast on ARM NEON, and should speed up vectorized code on other platforms.

Without vectorization, the function is still a percent or two faster,
but slightly larger in code size.
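
The idea can be illustrated with a minimal sketch. This is not the code
from u_vbuf.c: the helper names minmax_u16_widened and minmax_u16_narrow
are hypothetical, and only the 16-bit index case is shown. The first
variant keeps the running min/max in 32-bit variables, so each index is
widened before it is compared. The second keeps them in the index type
and widens only once at the end, which is the kind of loop a compiler
can vectorize with narrow lanes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" shape: the accumulators are 32-bit, so every
 * 16-bit index is converted to unsigned before each comparison. */
static void
minmax_u16_widened(const uint16_t *indices, size_t count,
                   unsigned *out_min, unsigned *out_max)
{
   assert(indices != NULL || count == 0);

   unsigned min = ~0u;
   unsigned max = 0;

   for (size_t i = 0; i < count; i++) {
      if (indices[i] > max)
         max = indices[i];
      if (indices[i] < min)
         min = indices[i];
   }

   *out_min = min;
   *out_max = max;
}

/* Hypothetical "after" shape: the accumulators have the same width as
 * the indices, so the comparisons stay 16-bit. The result is widened
 * once, after the loop. Note that for count == 0 this variant reports
 * a minimum of UINT16_MAX rather than ~0u. */
static void
minmax_u16_narrow(const uint16_t *indices, size_t count,
                  unsigned *out_min, unsigned *out_max)
{
   assert(indices != NULL || count == 0);

   uint16_t min = UINT16_MAX;
   uint16_t max = 0;

   for (size_t i = 0; i < count; i++) {
      if (indices[i] > max)
         max = indices[i];
      if (indices[i] < min)
         min = indices[i];
   }

   *out_min = min;
   *out_max = max;
}
```

Both variants return the same values for any non-empty input. Whether
the narrow variant is actually vectorized, and how much faster it is,
depends on the compiler, its flags, and the target, so comparing the
generated assembly (for example with gcc -O3 -S) is the way to confirm.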

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <mesa/mesa!3050>
