pvr: correctly emit tpu_tag_cdm_ctrl
What does this MR do and why?
For GPUs with the TPU_DM_GLOBAL_REGISTERS feature, an extra register field, tpu_tag_cdm_ctrl, is present in the compute command stream.
Emit it (with a currently hardcoded value) so that compute works on GPUs with TPU_DM_GLOBAL_REGISTERS.
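
A minimal sketch of the idea, not the actual driver code. The struct, the function name, the placeholder value, and the position of the field at the end of the stream are all hypothetical and chosen only for illustration; the real layout and value come from the hardware definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device feature query; not the driver's API. */
struct device_info {
   bool has_tpu_dm_global_registers;
};

/* Hypothetical hardcoded value; the real field contents are hardware-defined. */
#define TPU_TAG_CDM_CTRL_PLACEHOLDER 0u

/* Copies the base compute stream words into `out`, then emits the extra
 * tpu_tag_cdm_ctrl word only when the feature is present.
 * Returns the number of words written; `out` must hold base_count + 1 words.
 */
static size_t emit_compute_stream(const struct device_info *info,
                                  const uint32_t *base_words,
                                  size_t base_count,
                                  uint32_t *out)
{
   size_t n = 0;

   for (size_t i = 0; i < base_count; i++)
      out[n++] = base_words[i];

   /* Assumption: appended last here; the real position may differ. */
   if (info->has_tpu_dm_global_registers)
      out[n++] = TPU_TAG_CDM_CTRL_PLACEHOLDER;

   return n;
}
```

The point of the sketch is that the stream length depends on the feature: omitting the word on hardware that expects it leaves every following word misaligned, which is why compute fails without this change.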