intel/compiler,nir: add support for 8/16-bit task payload loads & stores
Depends on !16852 (merged).
This is achieved by:
- extending nir_lower_task_shader to move the task payload to shared memory when the shader uses task payload loads or stores smaller than 32 bits
- modifying brw_nir_lower_mem_access_bit_sizes to load the full 32-bit value from the task payload and mask out the unneeded bits (for mesh shaders)
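
As a rough illustration of the second point, the load-and-mask idea can be sketched in plain C. This is not the NIR lowering code itself: the helper names `payload_load_u8` and `payload_load_u16` are hypothetical, the payload is modeled as an array of 32-bit words with little-endian byte order, and 16-bit accesses are assumed to be 2-byte aligned so they never straddle two words.

```c
#include <stdint.h>

/* Hypothetical helper: read one byte from a payload that can only be
 * accessed in 32-bit units. Loads the containing 32-bit word, shifts the
 * wanted byte down to bit 0, and masks out the rest. */
static inline uint8_t
payload_load_u8(const uint32_t *payload, uint32_t byte_offset)
{
   uint32_t word = payload[byte_offset / 4];   /* full 32-bit load */
   uint32_t shift = (byte_offset % 4) * 8;     /* bit position of the byte */
   return (uint8_t)((word >> shift) & 0xffu);
}

/* Hypothetical helper: same idea for a 16-bit value.
 * Assumes byte_offset is a multiple of 2, so the value lies entirely
 * inside a single 32-bit word. */
static inline uint16_t
payload_load_u16(const uint32_t *payload, uint32_t byte_offset)
{
   uint32_t word = payload[byte_offset / 4];   /* full 32-bit load */
   uint32_t shift = (byte_offset % 4) * 8;     /* 0 or 16 when aligned */
   return (uint16_t)((word >> shift) & 0xffffu);
}
```

Stores are not covered by this sketch; per the first point, small accesses on the task shader side are handled by moving the payload to shared memory.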