
lima/gpir: Support for branching and multiple basic blocks

This is something that I've been working on for a while, on and off. I just got it to pass the first piglit test, so now seems like the time to make an MR. However, there are still a lot of follow-up things to do:

  • Add undef support.
  • Do a full piglit run to flush out any more bugs.
  • Add a NIR pass to do what the existing load-splitting pass does, since the existing pass can't work across basic-block boundaries. Right now it inserts register loads/stores when a load_uniform/input/constant instruction is in a different basic block from its use, which isn't great. This is WIP.
  • Clean up trivial basic blocks generated from conditional breaks/continues, and clean up the not instructions emitted when generating the branch conditions. These need to be done in gpir ATM since NIR doesn't support unstructured branching.
  • Spilling, and scratch support in general, isn't here yet. The tricky part here is dealing with the address registers, and the need for an extra value register while writing to scratch to load the address.
  • I want to rewrite the register allocator to happen in two passes: first allocate cross-block physical registers via graph coloring, then allocate value registers via linear scan like the existing pass. The RA implemented here isn't very good at assigning colors smartly to reduce false dependencies, which might regress code size for some shaders, and it can't split live ranges for value registers (which is pretty much free) to avoid spilling. Also, spilling has to happen over multiple rounds, which always sucks. Linear scan can spill directly while allocating, and for the first graph-coloring pass we can simply assign some registers directly to scratch while allocating and change load/store_reg to load/store_temp. The mesa allocator can't do that, and it has a bunch of stuff for register classes that we don't need, so that's why I'm not using it.
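
To illustrate the "linear scan can spill directly while allocating" point from the last item, here is a minimal sketch in C. It is not code from this MR and does not use any Mesa or gpir API: `struct interval`, `linear_scan`, `NUM_REGS` and `SPILLED` are all hypothetical names, the register count is kept tiny so the example actually spills, and whole intervals are spilled rather than split, which is simpler than what the planned allocator would do.

```c
#include <stddef.h>

#define NUM_REGS 2    /* hypothetical register count, small on purpose */
#define SPILLED  (-1) /* marker for "lives in scratch" (assumption) */

/* Hypothetical live interval, inclusive on both ends, measured in
 * instruction indices. */
struct interval {
   int start;
   int end;
   int reg; /* assigned register, or SPILLED */
};

/* Linear scan over intervals sorted by increasing start. When no register
 * is free, the interval that ends last (either an active one or the
 * current one) is spilled on the spot, so no extra allocation rounds are
 * needed. Returns the number of spilled intervals. */
int linear_scan(struct interval *iv, size_t n)
{
   struct interval *active[NUM_REGS] = { NULL }; /* indexed by register */
   int spills = 0;

   for (size_t i = 0; i < n; i++) {
      struct interval *cur = &iv[i];
      struct interval *furthest = cur; /* spill candidate */
      int free_reg = -1;

      for (int r = 0; r < NUM_REGS; r++) {
         /* Expire intervals that ended before the current one starts. */
         if (active[r] && active[r]->end < cur->start)
            active[r] = NULL;

         if (!active[r]) {
            if (free_reg < 0)
               free_reg = r;
         } else if (active[r]->end > furthest->end) {
            furthest = active[r];
         }
      }

      if (free_reg >= 0) {
         /* A register is available, take it. */
         cur->reg = free_reg;
         active[free_reg] = cur;
      } else if (furthest != cur) {
         /* Steal the register of the interval that ends last. */
         cur->reg = furthest->reg;
         active[cur->reg] = cur;
         furthest->reg = SPILLED;
         spills++;
      } else {
         /* The current interval itself ends last, so spill it. */
         cur->reg = SPILLED;
         spills++;
      }
   }

   return spills;
}
```

With two registers and the intervals [0,10], [1,3], [2,5], [4,6], the long first interval gets evicted when the third one arrives, and the fourth reuses the register freed by the second, all in a single pass over the intervals.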
Edited by Connor Abbott
