Try to fix gamma handling for GAMMA_LUT_SIZE > 1024.

Merged Mario Kleiner requested to merge kleinerm/xserver:icelakefixes into master

While MR !747 (merged) does work around limitations of the X-Servers gamma handling code and does fix #1193 (closed), this comes at a heavy disadvantage, especially for Intel Icelake and later. Whenever the legacy gamma lut is used on intel kms, the driver will switch back to the legacy hardware gamma lut, which is 256 slots, 8 bit wide, with no interpolation from framebuffer to lut. This means that framebuffers of any depth > 8 bpc are treated like 8 bpc framebuffers, and output color precision is limited to 8 bpc like in prehistoric times. Deep color modes like 10 bpc / depth 30 are no longer usable, neither are fp16 or fixed point rgba16 modes which are supported by these gpu's. This makes them behave worse wrt. precision than 10 year old Intel Ironlake - Kabylake!

Similar issues exist for AMD, MALI, KOMEDA and MEDIATHEK kms drivers, where a GAMMA_LUT of 4096 or 512 slots is normally used, and various restricitions apply for the legacy 256 slots lut.

This MR tries to work around the problem in a different way:

  1. Making sure that the memcpy() which can cause a server crash by reading past the end of a lut (see #1193 (closed)) can no longer read past the end of the lut. This will avoid the crash. Looking at debug output showed that the memcpy path is only taken during server startup (xf86HandleColormaps() and earlier), but then a palette is created, so the server takes the path through xf86RandR12CrtcComputeGamma(), which makes sure that data is properly up- or down-sampled for mismatching source and target crtc gamma sizes, at least as long as the target crtc's gamma lut (ie. GAMMA_LUT) has a power-of-two size.

  2. Extend xf86RandR12CrtcComputeGamma() to deal a bit better with non-power-of-two sizes. We now replicate the last properly assigned value into all slots of the crtc gamma table which would normally not be reached. This is not perfect, but shoud be better than leaving those slots unassigned. For reference, a check of all current kms drivers showed that they all use GAMMA_LUTs with a power of two, typically 256, 512, 1024 or 4096. Only Intel is an exception with Icelake and later and their lut's of 2^18 + 1 slots size. here the last slot is important, so must be assigned at least a reasonable value.

  3. Revert the restrictions of only accepting GAMMA_LUT's of size 1024.

Successfully tested against the modesetting-ddx and current amdgpu-ddx with screen color depth 24 (8 bpc) and 30 (10 bpc), to make sure that clamping happens properly. Interestingly, testing on AMD hw with a GAMMA_LUT size of 4096 slots never caused a crash for myself, although the memcpy() reads past the randr_crtc's 1024 slot lut. Apparently this works due to dumb luck - reading some memory that is allocated but unrelated.

I could not test on Icelake+ because i don't have suitable hardware, but i hope this is a reasonable improvement.

Signed-off-by: Mario Kleiner

Merge request reports