lima: ppir: always use vec4 for output register
What does this MR do and why?
gl_FragDepth is a float, but the hardware still uses a vec4 register for output: the .x component holds depth and another component holds stencil. So we always have to allocate a vec4 for the output register, even when the shader only writes a single component.
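A minimal sketch of the allocation rule described above, for illustration only. The helper name `sketch_reg_num_components` and its parameters are hypothetical and do not come from the ppir sources; the actual change may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not the actual ppir code: decide how many
 * components to allocate for a register, given how many components
 * the shader writes and whether the register is the output register. */
static int sketch_reg_num_components(int written_components, bool is_output)
{
   assert(written_components >= 1 && written_components <= 4);

   /* The output register is always a full vec4: the hardware takes
    * depth from .x and stencil from another component, so a float
    * gl_FragDepth write still needs all four components reserved. */
   if (is_output)
      return 4;

   /* Other registers keep the size the shader actually uses. */
   return written_components;
}
```

With this rule, a scalar gl_FragDepth write (`written_components == 1`, `is_output == true`) still gets a 4-component register, while a scalar temporary stays scalar.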