ir3: lower 64b registers
After all int64/double lowerings, there might still be 64b registers left which ir3 currently doesn't handle. This only happens in a small number of Piglit tests where those registers (or the variables they come from) did not get DCE'd.
This patch handles 64b registers in ir3 by adding a NIR pass that does the following:
- @decl_reg -> split in two 32b ones
- @store_reg -> unpack_64_2x32_split_x/y and two separate stores
- @load_reg -> two separate loads and pack_64_2x32_split
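As a rough illustration of the store/load halves of this lowering, here is a minimal scalar sketch in plain C. It is not the pass itself: struct split_reg, store_reg_64 and load_reg_64 are hypothetical names made up for this example, and the three small helpers only mirror what the NIR opcodes of the same name compute on a single component (x is the low 32 bits, y is the high 32 bits).

```c
#include <stdint.h>

/* Hypothetical stand-in for one element of a 64b register after it has
 * been split: each field plays the role of one of the two 32b registers. */
struct split_reg {
   uint32_t lo; /* 32b register holding the low halves */
   uint32_t hi; /* 32b register holding the high halves */
};

/* Scalar models of the NIR ALU ops used by the lowering. */
static uint32_t unpack_64_2x32_split_x(uint64_t v)
{
   return (uint32_t)v; /* low 32 bits */
}

static uint32_t unpack_64_2x32_split_y(uint64_t v)
{
   return (uint32_t)(v >> 32); /* high 32 bits */
}

static uint64_t pack_64_2x32_split(uint32_t x, uint32_t y)
{
   return (uint64_t)x | ((uint64_t)y << 32);
}

/* A 64b store becomes two unpacks followed by two 32b stores. */
static void store_reg_64(struct split_reg *reg, uint64_t value)
{
   reg->lo = unpack_64_2x32_split_x(value);
   reg->hi = unpack_64_2x32_split_y(value);
}

/* A 64b load becomes two 32b loads followed by one pack. */
static uint64_t load_reg_64(const struct split_reg *reg)
{
   return pack_64_2x32_split(reg->lo, reg->hi);
}
```

Storing a value with store_reg_64 and reading it back with load_reg_64 returns the original 64b value, which is the property the lowering relies on.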
For example, the relevant part of the fragment shader from spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-dvec3 gets lowered to the following NIR:
32 %1444 = @decl_reg () (num_components=3, num_array_elems=2, bit_size=64, divergent=1)
...
64x3 %9 = vec3 %5, %6, %8
@store_reg (%9, %1444) (base=0, wrmask=xyz, legacy_fsat=0)
...
64x3 %1446 = @load_reg_indirect (%1444, %30) (base=0, legacy_fabs=0, legacy_fneg=0)
The ir3_nir_lower_64b_regs pass proposed by this patch lowers this to:
32 %1448 = @decl_reg () (num_components=3, num_array_elems=2, bit_size=32, divergent=1)
32 %1447 = @decl_reg () (num_components=3, num_array_elems=2, bit_size=32, divergent=1)
...
64x3 %9 = vec3 %5, %6, %8
32x3 %1449 = unpack_64_2x32_split_x %9
32x3 %1450 = unpack_64_2x32_split_y %9
@store_reg (%1449, %1448) (base=0, wrmask=xyz, legacy_fsat=0)
@store_reg (%1450, %1447) (base=0, wrmask=xyz, legacy_fsat=0)
...
32x3 %1453 = @load_reg_indirect (%1448, %30) (base=0, legacy_fabs=0, legacy_fneg=0)
32x3 %1454 = @load_reg_indirect (%1447, %30) (base=0, legacy_fabs=0, legacy_fneg=0)
64x3 %1455 = pack_64_2x32_split %1453, %1454
After this pass, the 64b vecs used for the original loads/stores are still present and are also not handled yet by ir3. This patch removes them by running nir_lower_alu_to_scalar and nir_copy_prop.
Fixes the following Piglit tests:
- spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-dvec3
- spec@arb_gpu_shader_fp64@uniform_buffers@fs-array-copy
- spec@arb_gpu_shader_fp64@uniform_buffers@gs-array-copy
- spec@arb_gpu_shader_fp64@uniform_buffers@vs-array-copy
- spec@arb_tessellation_shader@execution@variable-indexing@vs-output-array-dvec4-index-wr-before-tcs
This patch has no impact on shader-db.