Skip to content

lima/ppir: implement full liveness analysis for regalloc

Erico Nunes requested to merge enunes/mesa:ppir-new-regalloc into master

The existing liveness analysis in ppir still ultimately relies on a single continuous live_in and live_out range per register and was observed to be the bottleneck for register allocation on complicated examples with several control flow blocks. The use of live_in and live_out ranges was fine before ppir got control flow, but now it ends up creating unnecessary interferences as live_in and live_out ranges may span across entire blocks after blocks get placed sequentially.

This new liveness analysis implementation generates a set of live variables at each program point; before and after each instruction and beginning and end of each block. This is a global analysis and propagates the sets of live registers across blocks independently of their sequence. The resulting sets optimally represent all variables that cannot share a register at each program point, so can be directly translated as interferences to the register allocator.

Special care has to be taken with non-ssa registers. In order to properly define their live range, their alive components also need to be tracked. Therefore ppir can't use simple bitsets to keep track of live registers.

The algorithm uses an auxiliary set data structure to keep track of the live registers. The initial implementation used only trivial arrays, however regalloc execution time was then prohibitive (>1minute on Cortex-A53) on extreme benchmarks with hundreds of instructions, hundreds of registers and several spilling iterations, mostly due to the n^2 complexity to generate the interferences from the live sets. Since the live registers set are only a very sparse subset of all registers at each instruction, iterating only over this subset allows it to run very fast again (a couple of seconds for the same benchmark).

Merge request reports