r600: Cayman SPI_LDS_MGMT.NUM_LS_LDS 8160 dword limit can possibly be disabled for full 32 KB of compute shared memory?
The Gallium R600 driver currently sets SPI_LDS_MGMT.NUM_LS_LDS to 255 on Northern Islands, which means 8160 dwords (there's also an lds_size <= 8160 assertion in evergreen_emit_dispatch for Cayman).
However, Northern Islands is a Direct3D 11 GPU family, and D3D11 requires the full 8192 dwords (32 KB) of groupshared memory, so there is most likely a way to use all 8192 dwords on it.
According to the SPI_LDS_MGMT documentation:
NUM_PS_LDS
PS LDS limit, format is [12:5]. A setting of 1 means 32 dwords, 255 means 8160 dwords, 0 disables the limit.
Since NUM_LS_LDS is documented as "same desc[ription] as PS", I think NUM_LS_LDS can be set to 0 too to disable the limit in compute shaders.
How safe would it be to use the value 0 rather than 255 when the compute shader needs 32 KB of shared memory?
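To make the arithmetic behind the question concrete, here is a minimal sketch in C. It only models the field encoding quoted above (the 8-bit field holds bits [12:5] of the dword limit, so one unit is 32 dwords). The helper names are hypothetical and not part of Mesa, and the "0 disables the limit" behaviour for NUM_LS_LDS is exactly the untested assumption this report is asking about.

```c
#include <stdint.h>

/* The field holds bits [12:5] of the dword limit, i.e. one unit = 32 dwords. */
#define LDS_LIMIT_SHIFT     5u
#define LDS_LIMIT_FIELD_MAX 255u

/* Dword limit expressed by a non-zero field value (1 -> 32, 255 -> 8160). */
static uint32_t lds_limit_field_to_dwords(uint32_t field)
{
    return field << LDS_LIMIT_SHIFT;
}

/*
 * Hypothetical helper, not the driver's code.
 * Keeps the driver's current value (255) whenever it is sufficient and
 * falls back to 0 only when more than 8160 dwords are requested.
 * ASSUMPTION: 0 disables the limit for NUM_LS_LDS, as documented for
 * NUM_PS_LDS ("same desc as PS"). This has not been verified on hardware.
 */
static uint32_t num_ls_lds_for(uint32_t lds_dwords)
{
    if (lds_dwords > lds_limit_field_to_dwords(LDS_LIMIT_FIELD_MAX))
        return 0; /* limit disabled (assumed) */
    return LDS_LIMIT_FIELD_MAX;
}
```

The point of the sketch: 8192 dwords would need a field value of 8192 >> 5 = 256, which does not fit in 8 bits, so the only encoding that could permit the full 32 KB is 0.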
There's also the SQ_LDS_ALLOC register, which is unfortunately undocumented. Given that num_waves is placed at << 14 in it, unless there's some flag at [13], there should be 14 bits for the LDS size, which is enough for 8192 (14 bits can hold values up to 16383). SQ_LDS_ALLOC is configured the same way on Evergreen and Northern Islands in the Gallium R600 driver, and 8192 is usable on Evergreen there, so I'd expect that to be the case on Northern Islands too.
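The bit-width reasoning for SQ_LDS_ALLOC can be checked with a small sketch. The only fact taken from the driver is that num_waves is shifted left by 14; the size mask, the function names, and the absence of a flag at bit 13 are assumptions, since the register is undocumented.

```c
#include <stdint.h>

/* From the driver: num_waves is placed at << 14. */
#define SQ_LDS_ALLOC_NUM_WAVES_SHIFT 14u
/* ASSUMPTION: all bits below num_waves, [13:0], hold the LDS size in dwords. */
#define SQ_LDS_ALLOC_SIZE_MASK ((1u << SQ_LDS_ALLOC_NUM_WAVES_SHIFT) - 1u)

/* Hypothetical check: does the size fit in the assumed 14-bit field? */
static int lds_size_fits(uint32_t lds_dwords)
{
    return lds_dwords <= SQ_LDS_ALLOC_SIZE_MASK;
}

/* Hypothetical packing, mirroring the layout described in the report. */
static uint32_t pack_sq_lds_alloc(uint32_t lds_dwords, uint32_t num_waves)
{
    return (lds_dwords & SQ_LDS_ALLOC_SIZE_MASK) |
           (num_waves << SQ_LDS_ALLOC_NUM_WAVES_SHIFT);
}
```

Note that 8192 is 1 << 13, so a full 32 KB allocation is the first size that sets bit 13. That is the one bit whose meaning the reasoning above depends on.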