Implement task shader support (but not yet the task/mesh draw commands) for the NV_mesh_shader extension.
A task shader is an optional stage that can run before a mesh shader in a graphics pipeline. It is a compute-like stage whose primary output is the number of mesh shader workgroups to launch (one task shader workgroup can launch up to 2^22 mesh shader workgroups). It also has an optional payload output of up to 16K bytes.
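As a rough illustration of the two limits above, here is a minimal C sketch. The macro and function names are hypothetical and are not part of this MR or of Mesa. It also assumes "16K" means 16 * 1024 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the values come from the description above. */
#define MAX_MESH_WORKGROUPS_PER_TASK (1u << 22) /* 2^22 */
#define MAX_TASK_PAYLOAD_BYTES 16384u           /* assumes 16K = 16 * 1024 */

static_assert(MAX_MESH_WORKGROUPS_PER_TASK == 4194304u,
              "2^22 mesh shader workgroups per task shader workgroup");

/* Returns true if the outputs of one task shader workgroup fit both limits. */
static bool
task_output_within_limits(uint32_t mesh_workgroup_count, uint32_t payload_bytes)
{
   return mesh_workgroup_count <= MAX_MESH_WORKGROUPS_PER_TASK &&
          payload_bytes <= MAX_TASK_PAYLOAD_BYTES;
}
```

For example, a task workgroup that launches exactly 2^22 mesh workgroups with a full 16384-byte payload passes the check, while one more workgroup or one more payload byte does not.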
Task shaders on RDNA2 work by submitting to two queues in parallel: a compute queue, which executes the actual task shader (compiled into a compute shader), and a graphics queue, which executes the mesh shader and everything else. The compute submission must be added as a scheduled dependency of the graphics submission.
nir_var_mem_task_payload is added to represent the task shader output payload. This is necessary because nir_var_shader_out is limited to 32 vec4 generic outputs, which cannot fit the 16K payload.
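The size mismatch can be checked with quick arithmetic. The sketch below assumes each vec4 component is 32 bits wide and that "16K" means 16 * 1024 bytes. All names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration only. */
#define GENERIC_OUTPUT_SLOTS 32u      /* 32 vec4 generic outputs */
#define COMPONENTS_PER_VEC4 4u
#define BYTES_PER_COMPONENT 4u        /* assumption: 32-bit components */
#define TASK_PAYLOAD_LIMIT_BYTES (16u * 1024u)

/* Total bytes that fit in the generic outputs under the assumptions above. */
static uint32_t
generic_output_capacity_bytes(void)
{
   return GENERIC_OUTPUT_SLOTS * COMPONENTS_PER_VEC4 * BYTES_PER_COMPONENT;
}

/* The payload limit is far larger than the generic output capacity. */
static_assert(TASK_PAYLOAD_LIMIT_BYTES >
              GENERIC_OUTPUT_SLOTS * COMPONENTS_PER_VEC4 * BYTES_PER_COMPONENT,
              "task payload does not fit in generic outputs");
```

Under these assumptions the generic outputs hold 512 bytes, which is 32 times smaller than the maximum payload, so a separate variable mode is needed.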
firstTask in Task shaders (not supported natively by the hardware)