As you can see in the output of drm_info here, GAMMA_LUT_SIZE is 262145 and DEGAMMA_LUT_SIZE is 33, while legacy gamma size is reported correctly as 256.
@danvet @vsyrjala Any idea about this? kms driver problem? Hw issue? I think the GAMMA_LUT_SIZE is as intended for ICL+, after reading through the ICL/TGL/DG-1 PRM's and kms driver code, but something seems to be not right in the lower layers. This is throwing a wrench into the attempt to make some use of GAMMA_LUT on ICL+ for the X-Server 21.1 release, and is also not good for KDE/GNOME on Wayland afaiu from the linked bug reports, because using a legacy gamma lut size of 256 will trigger legacy hw gamma luts on Intel and we are back in the 90's wrt. color precision :(.
That ridonculous LUT size is a disgusting hack due to the non-linear nature of the multi segment LUT we use atm (which is really only meant for HDR use cases). So I think what we really want is a new gamma uapi in kms that lets us select a use case appropriate LUT mode. For X11 that should IMO be the 256 entry legacy LUT when running at 24bpp, and the 1024 entry 10bit non-interpolated LUT mode when running in 30bpp mode. Neither of those modes requires the non-decreasing thing and thus would work for DirectColor visuals. For other non-HDR high color depth use cases that don't need DirectColor stuff we also have a linear 12bit interpolated mode available.
In the short term, I do wonder what we can do about X-Server 21.1 here, apart from giving up, which would be sad. Are you aware of any hardware errata or IGT test suite failures wrt. Tigerlake with non-decreasing LUT's? Our current use in XOrg master is suboptimal, submitting a staircase-shaped lut with each value repeating about 256 times, by naive upsampling from the 1024 slot RandR input lut to the 2^18+1 slot GAMMA_LUT, but I don't know why it would be decreasing at any point, and the PRM's don't forbid LUT's that are constant across successive values? Hard to wrap the head around it without actual hardware to test.
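For illustration only (this is not the actual xserver code, and the names are made up for the example), a minimal sketch of the kind of naive upsampling described above: a 1024 entry RandR ramp is mapped onto the 2^18+1 entry GAMMA_LUT by nearest-neighbour repetition, which gives a staircase but never a decreasing curve as long as the input ramp is non-decreasing.

```c
/* Hedged sketch: naive nearest-neighbour upsampling of a 1024-entry RandR
 * ramp into a 2^18+1 entry GAMMA_LUT. Each input value ends up repeated
 * roughly 256 times (a staircase), but the result stays non-decreasing if
 * the input ramp is non-decreasing. */
#include <stdint.h>
#include <stddef.h>

#define RANDR_SIZE 1024
#define GAMMA_SIZE ((1 << 18) + 1)   /* 262145, as reported by GAMMA_LUT_SIZE */

struct color_lut_entry { uint16_t red, green, blue; };  /* stand-in for the kms LUT entry */

static void upsample_lut(const uint16_t in_r[RANDR_SIZE],
                         const uint16_t in_g[RANDR_SIZE],
                         const uint16_t in_b[RANDR_SIZE],
                         struct color_lut_entry out[GAMMA_SIZE])
{
    for (size_t i = 0; i < GAMMA_SIZE; i++) {
        /* Map each output slot back to the nearest input slot
         * (64-bit intermediate to stay clear of overflow). */
        size_t j = (uint64_t)i * (RANDR_SIZE - 1) / (GAMMA_SIZE - 1);
        out[i].red   = in_r[j];
        out[i].green = in_g[j];
        out[i].blue  = in_b[j];
    }
}
```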
One thing that might happen is loading a temporary lut with random values during part of the server startup, which gets replaced a few milliseconds later with a non-decreasing upsampled lut. Could that somehow "jam" the display hardware and get it stuck on a broken lut?
> One thing that might happen is loading a temporary lut with random values during part of the server startup, which gets replaced a few milliseconds later with a non-decreasing upsampled lut. Could that somehow "jam" the display hardware and get it stuck on a broken lut?
KWin doesn't do that so I think that's pretty unlikely
I suspect the non-decreasing restriction is just there because the interpolation doesn't work right otherwise. So I doubt it would cause any serious harm if you violate it. But I must admit that I don't think I ever tested it (at least intentionally) on any platform.
Hmm, so then it should work, but doesn't, because of goblins. I assume kwin_wayland's gamma implementation does load a monotonically increasing lut by default?
It's also weird that it worked for our tester on Icelake, but failed on Tigerlake, despite the PRM's and driver code suggesting identical behaviour.
Does i915-kms by any chance have some magic debug option or debugfs entry or something to dump what is actually loaded as GAMMA_LUT or as the hw lut?
Hmm. Any change in behaviour if you set has_dsb=0 in GEN12_FEATURES?
I don't think we have a state dump for the LUTs anywhere. I usually just intel_reg read such things. But IIRC at least on icl the hw was a bit borked and you can't really read the multi-segment LUT back out from the hw. I think there was some workaround involving changing the LUT to a different mode for the readout, and then back into multi-seg mode afterwards. Good enough for debugging, but sadly not usable for the automagic state checker we have in the driver.
> I assume kwin_wayland's gamma implementation does load a monotonically increasing lut by default?
yes. AFAICT it's (like one would expect) just a 1:1 mapping of colors. Looking at the code though I think we might be getting integer overflows in our calculations with such huge gamma ramps, which might explain this quite easily... Is it possible that X has a similar problem somewhere? And does older hardware have similarly big gamma ramps, or are they much smaller?
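Purely as an illustration of the suspected failure mode (this is not KWin's actual code): if ramp values are computed with 32-bit intermediates, `index * 0xFFFF` already wraps once the index exceeds 65537, so most of a 262145 entry ramp comes out garbled and no longer non-decreasing; widening the intermediate to 64 bit avoids it.

```c
/* Hypothetical example of the suspected integer overflow, not KWin's code. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint32_t size = (1u << 18) + 1;   /* 262145 entries on TGL */
    uint32_t i = size - 1;                  /* last LUT slot */

    /* 32-bit intermediate: i * 0xFFFF wraps once i > 65537, so the value
     * for the last slot is wrong and the ramp is no longer monotonic. */
    uint32_t bad  = i * 0xFFFFu / (size - 1);

    /* 64-bit intermediate: correct for any realistic LUT size. */
    uint64_t good = (uint64_t)i * 0xFFFFu / (size - 1);

    printf("32-bit math: %u, 64-bit math: %llu\n",
           bad, (unsigned long long)good);  /* prints 16383 vs 65535 */
    return 0;
}
```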
@Zamundaaa Conceivable, I haven't looked at it from that perspective. Icelake and later are the only gpu's with > 4096 slots. I wish I had a modern Intel gpu for testing. It would beg the question, though, why it worked for our tester on Icelake, where the same lut size was used.
There's more: with our current code, with the overflow fixed (hasn't been tested yet), the best we get is a 16 bit gamma curve stretched to fill the huge hardware LUT, with no added resolution over a 2^16 entry gamma curve. I assume we're supposed to do higher resolution calculations and then downsample, with the effect that highly nonlinear curves can be mapped better?
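One way to read "higher resolution calculations" (again just a sketch, not any compositor's actual code, and using a plain power-law gamma only as a stand-in for whatever curve is really applied): evaluate the transfer function once per hardware LUT slot instead of stretching a precomputed 2^16 entry table.

```c
/* Hedged sketch: compute the curve directly at each of the 262145 slots. */
#include <math.h>
#include <stdint.h>
#include <stdlib.h>

#define LUT_SIZE ((1 << 18) + 1)   /* 262145 on TGL */

static uint16_t *build_gamma_lut(double gamma)
{
    uint16_t *lut = malloc(LUT_SIZE * sizeof(*lut));
    if (!lut)
        return NULL;

    for (size_t i = 0; i < LUT_SIZE; i++) {
        double in = (double)i / (LUT_SIZE - 1);               /* normalized input */
        lut[i] = (uint16_t)lround(pow(in, 1.0 / gamma) * 0xFFFF);
    }
    return lut;   /* caller frees */
}
```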
The test patch fakes the 2^18 + 1 slot LUT of Tigerlake, goes through the moves, and checks for wraparound / overflow / decreasing LUT values: none of this happens under X, so this can't be the problem - unless it would behave differently on a real Tigerlake gpu, ofc, for some weird reason.
And indeed, with only 2^16 gamma values to fill up a 2^18 LUT, one can not avoid repeating sample values and just stretching the LUT, as @Zamundaaa says. In the case of X we repeat each value of the 1024 slot input lut about 256 times, which is a very stretched curve.
The intel kms driver itself does not use most of the 2^18 values on Icelake+, it only samples a subset of less than 1000 samples and ignores almost all other samples.
@vsyrjala So the test patch, when executed on our tester's machine with a Tigerlake GT2, shows no wrongdoing from the X-Server. It also loads a linear LUT which increases once every 5 steps, reaching 80% of the maximum at the end.
The user does use various i915 related kernel boot options though:
mem_sleep_default=deep zswap.zpool=z3fold nmi_watchdog=0 memmap=12M$20M mitigations=off i915.mitigations=off iommu=pt dell-smm-hwmon.ignore_dmi=1 cpuidle_sysfs_switch=on pcie_aspm=force msr.allow_writes=on drm_kms_helper.poll=0 l1d_flush_out=off nvme_core.default_ps_max_latency_us=18000000 i915.enable_psr=1 i915.disable_power_well=1 i915.enable_dc=4 i915.enable_guc=2
The dsb feature - offloading mmio programming of the display hw to a coprocessor afaiu - seems to be only used on Tigerlake+ by default?
I've set has_dsb=0 in GEN12_FEATURES, rebuilt the kernel - and now it works with GammaLUT enabled and the Xorg patch from Mario. Just wondering if it is OK to leave this has_dsb=0 for regular laptop use.
Yes, has_dsb=0 is fine. The DSB atm is a bit pointless and not used really well. We have plans to make more extensive use of it in the future though.
What's a bit strange is that we do have some gamma tests in igt that appear to be passing on tgl [1]. The tests aren't particularly extensive, but if it's totally incapable of loading a sensible LUT then I would expect even those tests to fail.
If you have a bit of time, could you run the igt kms_color tests on your tgl with has_dsb=1 and let me know how it went?
I just need more instructions on how to run the test. And the main issue is that there are no errors - everything works fine, just the colors on the screen are terrible. But it works - compiz, unigine demos - they look ugly but work and give no error messages.
Might need to install various deps naturally. meson should complain about missing stuff.
Then kill your X/wayland compositor/whatever and run the test like so:
./build/tests/kms_color
Best do that from an ssh session though if you can, so you can actually see the terminal output. Or perhaps just redirect to a file and examine after the test has finished.
One more funny thing. I have a script to toggle day/night display mode by hotkey
It just calls xcalib with 2 color profiles. One click -
xcalib -d :0 /usr/share/color/icc/Gamma6500K.icc
second click-
xcalib -d :0 /usr/share/color/icc/Gamma5500K.icc
and so on in turn.
When I boot the kernel with has_dsb=1 and the patched Xorg and start pressing this button, toggling gamma between 6500 and 5500, the screen picture changes - every click it shows a different magic mushroom landscape, but sometimes the screen becomes normal. With the next click it again starts showing psychedelic pictures, but occasionally becomes normal after some number of clicks. With the has_dsb=0 kernel (and the debugging patch from Mario reverted - otherwise it ignores xcalib) everything works as expected.
Thanks Mark, very good! So without my patch to the X-Server, in other words just with the current XOrg master branch, everything works, as long as the kernel patch .has_dsb=0 is applied, even with all your special kernel boot options? Do I understand this correctly?
If so, then the X-Server should be fine, and it is a kernel problem.
commit 99510e1afb4863a225207146bd988064c5fd0629
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Thu Oct 14 21:18:56 2021 +0300

    drm/i915: Disable DSB usage for now
@vsyrjala On current kernels, I can see that both dg1_info and adl_p_info reference XE_LPD_FEATURES, which in turn defines display.has_dsb = 1, overriding your patch? Is it intentional to keep the old behaviour for DG2 and Alderlake-P?