Commit 8bd9a9c4 authored by Eduardo Lima Mitev

ir3/compiler: Use the NIR computed offset for image store/atomics

Remove the offset computation from the a4xx/a5xx backend and use the
one pre-computed in NIR by the lower_io_offsets pass.
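
For readers unfamiliar with the arithmetic being moved, the following is a minimal sketch in plain C of the offset computation that this commit removes from the backend. It is not the Mesa code: `struct image_dims`, `image_offset` and `lower_offset` are hypothetical names introduced only for illustration, and ordinary 32-bit unsigned arithmetic stands in for the `MUL_S`/`MAD_S24`/`SHR_B` instructions that the real backend emitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three per-image constants the removed
 * code read from uniforms: bytes per pixel, y pitch and z pitch. */
struct image_dims {
	uint32_t bytes_per_pixel;
	uint32_t y_pitch;
	uint32_t z_pitch;
};

/* Mirrors the removed arithmetic:
 *   offset = x * bytes_per_pixel + y * y_pitch + z * z_pitch
 * followed by a right shift by 2 when a dword offset is wanted,
 * which is what the atomics path asked for. */
static uint32_t
image_offset(const struct image_dims *dims, const uint32_t *coords,
		unsigned ncoords, bool byteoff)
{
	uint32_t offset = coords[0] * dims->bytes_per_pixel;

	if (ncoords > 1)
		offset += coords[1] * dims->y_pitch;
	if (ncoords > 2)
		offset += coords[2] * dims->z_pitch;
	if (!byteoff)
		offset >>= 2;

	return offset;
}

/* Sketch of the split after this commit (an assumption about the
 * shape, not the actual pass): a lowering step stores the offset in
 * the 4th component of the coordinate vector, and the backend only
 * has to read coords[3] back. */
static void
lower_offset(const struct image_dims *dims, uint32_t coords[4],
		unsigned ncoords, bool byteoff)
{
	coords[3] = image_offset(dims, coords, ncoords, byteoff);
}
```

With assumed dimensions of 4 bytes per pixel, a 256-byte y pitch and a 4096-byte z pitch, the coordinate (3, 2, 1) gives a byte offset of 4620 and a dword offset of 1155.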

No regressions observed on affected tests from Khronos CTS and piglit
suites, compared to master.

Unfortunately shader-db is not helpful for stats in this case. Few
shaders there exercise image store or image atomic, and of those that
do, most require higher versions of GLSL than 3.10, so they get skipped.
parent 9d5407b2
Pipeline #53533 passed with stages
in 23 minutes and 2 seconds
@@ -208,46 +208,15 @@ emit_intrinsic_atomic_ssbo(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 static struct ir3_instruction *
 get_image_offset(struct ir3_context *ctx, const nir_variable *var,
-		struct ir3_instruction * const *coords, bool byteoff)
+		struct ir3_instruction * const *coords)
 {
-	struct ir3_block *b = ctx->block;
-	struct ir3_instruction *offset;
-	unsigned ncoords = ir3_get_image_coords(var, NULL);
-
-	/* to calculate the byte offset (yes, uggg) we need (up to) three
-	 * const values to know the bytes per pixel, and y and z stride:
+	/* ir3_nir_lower_io_offsets pass should have placed the final
+	 * byte-offset (or dword offset for atomics) at the 4th component
+	 * of the coordinate vector.
 	 */
-	struct ir3_const_state *const_state = &ctx->so->shader->const_state;
-	unsigned cb = regid(const_state->offsets.image_dims, 0) +
-		const_state->image_dims.off[var->data.driver_location];
-
-	debug_assert(const_state->image_dims.mask &
-			(1 << var->data.driver_location));
-
-	/* offset = coords.x * bytes_per_pixel: */
-	offset = ir3_MUL_S(b, coords[0], 0, create_uniform(b, cb + 0), 0);
-	if (ncoords > 1) {
-		/* offset += coords.y * y_pitch: */
-		offset = ir3_MAD_S24(b, create_uniform(b, cb + 1), 0,
-				coords[1], 0, offset, 0);
-	}
-	if (ncoords > 2) {
-		/* offset += coords.z * z_pitch: */
-		offset = ir3_MAD_S24(b, create_uniform(b, cb + 2), 0,
-				coords[2], 0, offset, 0);
-	}
-
-	if (!byteoff) {
-		/* Some cases, like atomics, seem to use dword offset instead
-		 * of byte offsets.. blob just puts an extra shr.b in there
-		 * in those cases:
-		 */
-		offset = ir3_SHR_B(b, offset, 0, create_immed(b, 2), 0);
-	}
-
 	return ir3_create_collect(ctx, (struct ir3_instruction*[]){
-		offset,
-		create_immed(b, 0),
+		coords[3],
+		create_immed(ctx->block, 0),
 	}, 2);
 }
@@ -270,7 +239,7 @@ emit_intrinsic_store_image(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 	 * src2 is 64b byte offset
 	 */

-	offset = get_image_offset(ctx, var, coords, true);
+	offset = get_image_offset(ctx, var, coords);

 	/* NOTE: stib seems to take byte offset, but stgb.typed can be used
 	 * too and takes a dword offset.. not quite sure yet why blob uses
@@ -311,7 +280,7 @@ emit_intrinsic_atomic_image(struct ir3_context *ctx, nir_intrinsic_instr *intr)
 	 */
 	src0 = ir3_get_src(ctx, &intr->src[3])[0];
 	src1 = ir3_create_collect(ctx, coords, ncoords);
-	src2 = get_image_offset(ctx, var, coords, false);
+	src2 = get_image_offset(ctx, var, coords);

 	switch (intr->intrinsic) {
 	case nir_intrinsic_image_deref_atomic_add: