intel/fs: Fix shift counts for 8- and 16-bit types
With regard to implicit masking of the shift counts for 8- and 16-bit types, the PRMs are incorrect. They falsely state that on Gen9+ only the low bits of src1 matching the size of src0 (e.g., 4 bits for W or UW src0) are used. The Bspec (backed by data from experimentation) states that 0x3f is used for Q and UQ types, and 0x1f is used for all other types.
To match the behavior expected for the NIR opcodes, explicit masks for 8- and 16-bit types must be added.
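As an illustration of the mismatch, here is a minimal C model, not the driver code. The helper names hw_shl16 and nir_shl16 are hypothetical, and the model assumes the hardware shifts by (count & 0x1f) and then truncates the result to the 16-bit destination, while the NIR opcode expects only the low 4 bits of the count to be used.

```c
#include <stdint.h>

/* Hypothetical model of the hardware behavior described by the Bspec:
 * for non-64-bit types the shift count is masked with 0x1f, even when
 * the source is only 16 bits wide. The result is truncated to 16 bits.
 */
static uint16_t hw_shl16(uint16_t value, uint32_t count)
{
    return (uint16_t)((uint32_t)value << (count & 0x1f));
}

/* Assumed NIR expectation for a 16-bit shift: only the low 4 bits of
 * the count matter. The explicit AND with 0xf stands in for the mask
 * that has to be emitted before the hardware shift.
 */
static uint16_t nir_shl16(uint16_t value, uint32_t count)
{
    return hw_shl16(value, count & 0xf);
}
```

With a count of 16, hw_shl16(1, 16) shifts the bit out of the 16-bit result, whereas nir_shl16(1, 16) leaves the value unchanged because the masked count is 0.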
The changes to fs_visitor::opt_algebraic exist because fs_visitor::nir_emit_alu can generate AND instructions with two constants.
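The kind of folding involved can be sketched as follows. This is a toy model only: the toy_* types and toy_fold_and are hypothetical and do not reflect the actual fs_inst layout or the real opt_algebraic code. It assumes the two-constant AND arises when an explicit mask is applied to a shift count that is already an immediate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified instruction representation. */
enum toy_opcode { TOY_AND, TOY_MOV };

struct toy_src {
    bool is_imm;   /* true if the source is an immediate constant */
    uint32_t imm;  /* immediate value, valid only when is_imm is set */
};

struct toy_inst {
    enum toy_opcode op;
    struct toy_src src[2];
};

/* Fold "AND imm, imm" into "MOV imm". Returns true if the instruction
 * was rewritten, false if it was left untouched.
 */
static bool toy_fold_and(struct toy_inst *inst)
{
    if (inst->op != TOY_AND ||
        !inst->src[0].is_imm || !inst->src[1].is_imm)
        return false;

    inst->op = TOY_MOV;
    inst->src[0].imm &= inst->src[1].imm;
    inst->src[1] = (struct toy_src){ false, 0 };
    return true;
}
```

For example, an AND of the immediates 0x13 and 0xf would become a MOV of 0x3 under this model.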
This fixes the updated version of func.shader.shift.int16_t (see crucible!138) on all Intel platforms.
No shader-db or fossil-db changes on any Intel platform.