Issue observed on a ChromeOS hatch device (Comet Lake) with an ambient light sensor. In response to changes in ambient lighting, ChromeOS sets the CTM. We've observed that on the latest kernel, there is intermittent flickering (i.e. blank frames) while the CTM is being set.
I tested this on a ChromeOS kernel that's basically (c9c3395d5e3d Linux 6.2) with a bunch of stuff to enable building ChromeOS, so it should be pretty close to the upstream kernel.
My colleague determined that this issue started on (d13dde449580 drm/i915: Split pipe+output CSC programming to noarm+arm pair), which landed in the 5.19 kernel. Reverting this change causes the flickering to go away.
While debugging this issue, I also found that if the DMC firmware (i915/kbl_dmc_ver1_04.bin) fails to load, the issue is no longer reproducible. I'm not sure how relevant that point is, but thought it's worth mentioning.
There is a known issue with the pipe CSC on skl/glk/icl where even reads of the CSC registers will supposedly disarm an already armed double buffer update. But at least in upstream we never read from those, except from the state checker, which should only run after the update has already happened (so there is nothing left to disarm). You don't happen to have some extra patches in your tree that read/write the CSC registers somewhere else?
I tested this (on glk though, but should be no different really) and the double buffer update failure is definitely real. Any access to the coeff/preoff/postoff registers cancels the armed update.
But I don't think it can explain this bug, as the time during which we must not touch any of those registers is between the CSC_MODE write and the start of vblank. And splitting the registers into noarm vs. arm did not move the CSC_MODE write anywhere. It's still being done at the exact same point as before.
So that makes this feel more like something is corrupting the coefficients/offsets in the time between color_commit_noarm() and color_commit_arm(). Either that, or we somehow aren't even calling one of the pair of functions.
We don't have the CSC participating in the state checker atm. There was an attempt made at some point but I think it dried up: https://patchwork.freedesktop.org/series/88242/. That could have been helpful in figuring out if we indeed have corrupted/not updated the CSC somehow.
Also double checked BDW and TGL, and on those, reads do not disarm. Writes still do, but that is also how many other sets of double buffered registers work, i.e. perfectly normal.
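To make the arming behaviour described above concrete, here is a toy model in Python. It is only a sketch of the rules as stated in this thread, not i915 code or a hardware description: the class `ToyPipeCsc`, its methods, and the choice to return the staged value on a read are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPipeCsc:
    """Toy model of a double-buffered pipe CSC (hypothetical, not i915 code)."""

    # True models the skl/glk/icl behaviour (reads disarm);
    # False models the BDW/TGL behaviour (only writes disarm).
    reads_disarm: bool
    staged: list = field(default_factory=lambda: [0] * 9)  # written, not yet live
    active: list = field(default_factory=lambda: [0] * 9)  # what the pipe uses
    armed: bool = False

    def write_coeff(self, index, value):
        self.staged[index] = value
        self.armed = False  # a write cancels a pending update in both variants

    def read_coeff(self, index):
        if self.reads_disarm:
            self.armed = False  # the problematic behaviour discussed above
        return self.staged[index]  # modelling choice, not a hardware claim

    def write_csc_mode(self):
        self.armed = True  # the CSC_MODE write arms the update

    def vblank(self):
        """Latch the staged values if still armed; return whether it latched."""
        latched = self.armed
        if latched:
            self.active = list(self.staged)
        self.armed = False
        return latched


def update_with_stray_read(reads_disarm):
    """Program the CSC, arm it, then touch a coefficient before vblank."""
    csc = ToyPipeCsc(reads_disarm=reads_disarm)
    for i in range(9):
        csc.write_coeff(i, i + 1)
    csc.write_csc_mode()
    csc.read_coeff(0)  # stray access between CSC_MODE write and vblank
    return csc.vblank()
```

In this model `update_with_stray_read(True)` returns `False` (the update is lost), while `update_with_stray_read(False)` returns `True`.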
I hacked together the CSC state readout+check: https://github.com/vsyrjala/linux.git csc_state_checker. That could at least tell us if something is straight up corrupting the register values.
I checked this out, and copied all the chromeos config stuff alongside it, but the ambient color feature isn't working at all. I'll need to debug further to figure out if it's some config issue on my side, or something else that's not working properly on a more recent kernel. To be clear, I have no evidence to suggest that the regression is related to the csc_state_checker stuff or even a display issue in general - it's probably some kind of kernel feature detection that's not working as expected.
I'll circle back to this once I can get your csc_state_checker branch working.
I guess what might be interesting to do is enable the intel_pipe_update* tracepoints and the i915_reg_rw tracepoint (potentially filtering for just the pipe CSC registers if possible, not sure filtering for multiple regs is possible), and then check whether all register accesses land on the appropriate sides of vblank boundaries around the time the flicker happens.
Capturing the exact frame that flickers in the trace would also be ideal, but not sure how we'd achieve that. crc would be one option but that would require some kind of setup where every correct frame would produce a steady and/or known crc normally. Or some external sensor could be used I guess. But that's all getting a wee bit complicated.
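As a rough illustration of the trace check suggested above, the sketch below scans a list of already-parsed events and reports any CSC register access that falls between a CSC_MODE write and the next vblank, which is the window identified earlier in the thread as the one that must stay quiet. Everything here is an assumption for illustration: the event tuple layout, the register labels, and the function name `find_disarming_accesses` are hypothetical, and turning real tracepoint output into such tuples is left out because the exact trace format is not given in this thread.

```python
def find_disarming_accesses(events, arm_reg="CSC_MODE"):
    """Return CSC register accesses seen between an arm write and the next vblank.

    `events` is an iterable of (timestamp, kind, name) tuples sorted by
    timestamp. `kind` is "reg" for a pipe CSC register access (the caller is
    assumed to have filtered out other registers) or "vblank" for the start
    of vblank; `name` is a register label for "reg" events and None otherwise.
    """
    offenders = []
    armed = False
    for timestamp, kind, name in events:
        if kind == "vblank":
            armed = False  # update has latched (or was lost); window closes
        elif name == arm_reg:
            armed = True  # window opens at the CSC_MODE write
        elif armed:
            offenders.append((timestamp, name))  # access inside the window
    return offenders


# Hypothetical, hand-written events (not captured from a real device).
example_events = [
    (1.0, "reg", "CSC_COEFF"),
    (2.0, "reg", "CSC_MODE"),
    (2.5, "reg", "CSC_PREOFF"),
    (3.0, "vblank", None),
    (3.5, "reg", "CSC_COEFF"),
]
```

With the hand-written events above, only the access at timestamp 2.5 is reported, since it is the only one between the CSC_MODE write and the vblank.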
It's the usual /sys/kernel/debug/tracing stuff (or used via trace-cmd etc.). But I don't think we really need that data anymore. @ideak managed to reproduce all kinds of funny dc5/psr/csc interactions here, so the problem is confirmed to be real. But we'll need to do a bit more hardware pokery to figure out whether other parts of the hardware besides the csc are also affected to some degree.
Thanks for the update. So there is no longer a need for me to provide further logs/etc? Please let me know if there is anything I can provide. Otherwise, I'll keep an eye on this issue for further updates.
> You don't happen to have some extra patches in your tree that reads/writes the CSC registers somewhere else?
I don't think so. I double-checked what's applied on top of (c9c3395d5e3d Linux 6.2), and there are a bunch of android related merges, and chromeos config stuff. To be exact, it's this commit that I've tested.
To verify that there aren't any changes affecting i915 in that range:
> There is a known issue with the pipe CSC on skl/glk/icl where even reads of the CSC registers will supposedly disarm an already armed double buffer update
Is it possible that such a read is happening from the DMC firmware? As mentioned, this issue is not reproducible if the DMC firmware is not loaded.
@vsyrjala I noticed that color_commit_arm() calls intel_de_write_fw() to write to the PIPE_CSC_MODE register. Doesn't this write go through the DMC firmware? Could that be related to us not seeing the flicker when the DMC firmware is not loaded?
We checked it with Imre and DMC does save/restore the CSC registers. But it does the restore in the correct order (CSC_MODE last).
The only way I can think how DMC could accidentally disarm the update is if it does the state save (and obviously no restore) after we write CSC_MODE in the color_commit_arm(), but before the vblank when the double buffering latches. But that would be equally broken without the arm vs. noarm split as well. So the explanation must be something different.
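The two conditions in this reasoning can be written down as small predicates. This is only a sketch of the argument, assuming a platform where reads disarm an armed update. The function names, the register labels, and the idea of passing plain timestamps are hypothetical and do not come from the driver or the DMC firmware.

```python
def restore_order_ok(writes, arm_reg="CSC_MODE"):
    """True if the arming register is written exactly once and written last."""
    return writes.count(arm_reg) == 1 and writes[-1] == arm_reg


def save_can_disarm(csc_mode_write, dmc_save, vblank):
    """True if a state save (register reads) lands after arming but before vblank.

    All three arguments are timestamps on the same clock. This models only
    the hypothetical window described above, nothing about real DMC timing.
    """
    return csc_mode_write < dmc_save < vblank
```

For example, `restore_order_ok(["CSC_COEFF", "CSC_PREOFF", "CSC_MODE"])` holds, while a sequence that writes a coefficient after `CSC_MODE` does not. Note that `save_can_disarm()` takes no input describing where the coefficient writes happen, which mirrors the point that the arm vs. noarm split would not change this scenario.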
This (and other PSR/DMC/CSC issues) should now be fixed:
41b4c7fe72b6 drm/i915: Disable DC states for all commits
92736f1b452b drm/i915: Workaround ICL CSC_MODE sticky arming
3962ca4e080a drm/i915: Add a .color_post_update() hook
80a892a4c242 drm/i915: Move CSC load back into .color_commit_arm() when PSR is enabled on skl/glk
f161eb01f50a drm/i915: Split icl_color_commit_noarm() from skl_color_commit_noarm()