llvmpipe: Per-quad interpolation.
First interpolate the upper-left corners of the 4 quads, then sub-interpolate each pixel within its quad. Do the perspective divide once per quad. This saves some multiplies and reciprocals, but doesn't seem to make a noticeable improvement. It makes the code simpler and more compact, so committing anyway.
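A rough scalar sketch of the two-step scheme, for illustration only. The names (struct plane, interp_block, the offset tables) are hypothetical and not taken from the llvmpipe sources; the 4x4 block layout and the plane-equation form a(x, y) = a0 + dadx*x + dady*y are assumptions. The reciprocal is written per pixel here; in the vectorized code a single 4-wide operation would presumably cover a whole quad, which is one reading of "once per quad".

```c
/* Hypothetical plane equation: a(x, y) = a0 + dadx*x + dady*y. */
struct plane { float a0, dadx, dady; };

/* Assumed layout: upper-left corners of the 4 quads in a 4x4 block. */
static const int quad_x[4] = { 0, 2, 0, 2 };
static const int quad_y[4] = { 0, 0, 2, 2 };

/* Offsets of the 4 pixels inside one 2x2 quad. */
static const int pix_x[4] = { 0, 1, 0, 1 };
static const int pix_y[4] = { 0, 0, 1, 1 };

/*
 * attr holds the plane for a/w, oow holds the plane for 1/w.
 * out[q][i] receives the perspective-correct value for pixel i of quad q.
 */
static void interp_block(const struct plane *attr,
                         const struct plane *oow,
                         float out[4][4])
{
   float dadq[4], dwdq[4];
   int q, i;

   /* Per-pixel deltas are identical for every quad, so compute them once. */
   for (i = 0; i < 4; i++) {
      dadq[i] = attr->dadx * pix_x[i] + attr->dady * pix_y[i];
      dwdq[i] = oow->dadx * pix_x[i] + oow->dady * pix_y[i];
   }

   for (q = 0; q < 4; q++) {
      /* Step 1: interpolate at the quad's upper-left corner. */
      float a_q = attr->a0 + attr->dadx * quad_x[q] + attr->dady * quad_y[q];
      float w_q = oow->a0 + oow->dadx * quad_x[q] + oow->dady * quad_y[q];

      /* Step 2: sub-interpolate each pixel, then apply the perspective divide. */
      for (i = 0; i < 4; i++) {
         float a = a_q + dadq[i];
         float rcp = 1.0f / (w_q + dwdq[i]);
         out[q][i] = a * rcp;
      }
   }
}
```

With this arrangement the inner loop needs only adds, one reciprocal and one multiply per pixel; the multiplies by dadx and dady happen once per quad corner plus once for the shared delta table.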
Showing with 136 additions and 190 deletions