WIP: nv50/ir: improve LoadPropagation with a static constant table
Normally, one way to get rid of constant loads is to extract all constants into a buffer that is attached to the shader as an additional compilation result. One big disadvantage is that this would also require reuploading a different constant table each time a different shader gets bound.
This drawback can be mostly mitigated by having a single static table with the most-used constants that can't be optimized in any other way. This table gets uploaded once at screen creation time and is never touched again.
Because this table is put into the driver constbuf, we don't even use more memory, as the full buffer is already allocated anyway.
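As a rough illustration of the lookup side of this idea, the sketch below maps an immediate to a byte offset in a static table. It is not the code from this change: the table contents, the base offset `kTableBaseOffset`, and the names `floatBits` and `lookupStaticConstant` are all hypothetical. Values are compared by bit pattern, so 0.0 and -0.0 are treated as different constants.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical table contents; the real set of "most used" constants
// is not listed in this description.
static const float kStaticConstants[] = {0.5f, 2.0f, 3.14159265f, 255.0f};

// Hypothetical byte offset of the table inside the driver constbuf.
static const uint32_t kTableBaseOffset = 0x400;

// Reinterpret a float as its raw 32-bit pattern.
static uint32_t floatBits(float f)
{
    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    return u;
}

// Returns the byte offset of the immediate inside the constbuf,
// or -1 if the value is not part of the static table.
static int32_t lookupStaticConstant(uint32_t immBits)
{
    const size_t count = sizeof(kStaticConstants) / sizeof(kStaticConstants[0]);
    for (size_t i = 0; i < count; ++i) {
        if (floatBits(kStaticConstants[i]) == immBits)
            return static_cast<int32_t>(kTableBaseOffset + i * sizeof(float));
    }
    return -1;
}
```

A pass along these lines would presumably only rewrite an immediate into a constbuf load when the lookup succeeds, and leave the instruction untouched otherwise.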
Experiments suggest that a per-shader constant table would give around 25% better results, but that can still be added later.
The biggest concern with a per-shader table is that CPU-bound workloads would be affected the most, especially games like Civilization IV and V, which issue a lot of different draws with a lot of different shaders.
total instructions in shared programs : 10240501 -> 10188599 (-0.51%)
total gprs used in shared programs : 1125555 -> 1119673 (-0.52%)
total shared used in shared programs : 702868 -> 702868 (0.00%)
total local used in shared programs : 36424 -> 36420 (-0.01%)
              local     shared        gpr       inst      bytes
helped            1          0       4704      20948      20948
hurt              0          0         69        214        214