Draft: lima: wire up MSAA 4x support
Utgard supports MSAA 4x, so wire it up.
The RSW bits were already reverse-engineered by Luc; the only missing part was MSAA for the depth/stencil buffer. It turns out that MSAA 4x isn't actually free if you need to store the depth or stencil buffer. In that case the buffer has to be 4x the size, and on reload each sample has to be reloaded individually, so a depth/stencil reload costs 4x the memory bandwidth with MSAA 4x.
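
As a rough illustration of the cost described above, here is a minimal sketch of the arithmetic. The helper names and parameters are hypothetical and are not functions from the lima driver; the only assumption is that each sample is stored and reloaded separately, as stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a lima function: bytes needed for a
 * depth/stencil buffer that has to survive a store/reload cycle. */
static size_t
zs_storage_size(uint32_t stride, uint32_t height, unsigned samples)
{
   /* This sketch only considers single-sampled and MSAA 4x. */
   assert(samples == 1 || samples == 4);

   /* Every sample is kept separately, so size scales with sample count. */
   return (size_t)stride * height * samples;
}

/* Hypothetical helper: number of per-sample reload passes, which is also
 * the factor by which reload bandwidth grows. */
static unsigned
zs_reload_passes(unsigned samples)
{
   assert(samples == 1 || samples == 4);
   return samples;
}
```

For a 256-byte stride and 64 rows this gives 16384 bytes without MSAA and 65536 bytes with MSAA 4x, plus four reload passes instead of one.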
As a side fix, it turns out that our wb_reg definition wasn't correct. 'zero' isn't always zero: it is set when channels need to be swapped, and it comes before mrt_bits. mrt_bits actually enables MRTs (the blob sets it to 0xf for depth/stencil reload with MSAA enabled), and mrt_pitch holds the MRT pitch in bytes. So rename zero to flags and move it to its correct position.
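
The reordering can be pictured with the sketch below. It is an illustrative subset only: the struct name and the macro are hypothetical, the real register block has more fields, and the position of mrt_pitch directly after mrt_bits is an assumption. The actual layout is the one in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative subset, not the driver's definition. */
struct wb_reg_sketch {
   uint32_t flags;     /* formerly 'zero'; set when channels are swapped */
   uint32_t mrt_bits;  /* enables MRTs */
   uint32_t mrt_pitch; /* in bytes; placement after mrt_bits is assumed */
};

/* Hypothetical name for the value the blob uses for depth/stencil
 * reload with MSAA enabled, as described above. */
#define WB_MRT_BITS_MSAA_ZS_RELOAD 0xf

/* The point of the fix: flags comes before mrt_bits. */
static_assert(offsetof(struct wb_reg_sketch, flags) <
              offsetof(struct wb_reg_sketch, mrt_bits),
              "flags must precede mrt_bits");
```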
Fixes dEQP-GLES2.functional.multisample.*