lima/gpir: A few optimizations for branches
This gpir series fixes a few obvious inefficiencies in the code generated for multi-basic-block shaders, and fixes code-size regressions in complicated single-block shaders that were caused by the new register allocator. It also lays the foundation for how spilling will work.