NVE6 (GK106) memory re-clocking breaks GpuTest plot3d benchmark
Submitted by menzac Assigned to Nouveau Project
Link to original bug (#111724)
Description
I have run into a problem with NVE6 (GK106) in the GpuTest (https://www.geeks3d.com/gputest/) plot3d benchmark. It occurs only in plot3d and nowhere else. There are visible glitches, and when the benchmark is left running for a longer time, Nouveau seems to crash.
The GPU has 4 profiles:
07: core 324 MHz memory 648 MHz
0a: core 324-862 MHz memory 1620 MHz
0d: core 549-1228 MHz memory 6008 MHz
0f: core 549-1228 MHz memory 6008 MHz
The problem occurs when the re-clocking profile is switched directly from 648 MHz memory to 6008 MHz memory, skipping the 0xa (1620 MHz) profile. If the switch goes through the 0xa profile, everything works fine.
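The workaround described above can be sketched as a small script. This is only an illustration, not a fix: it assumes the profile is selected by writing the pstate ID to nouveau's debugfs file `/sys/kernel/debug/dri/0/pstate` (the card index `0` is an assumption), and the function names `plan_transition` and `apply_transition` are made up for this example. The memory clocks are copied from the profile list above.

```python
from pathlib import Path
from typing import List

# Memory clock (MHz) per profile, copied from the list in this report.
MEMORY_MHZ = {"07": 648, "0a": 1620, "0d": 6008, "0f": 6008}

# Assumption: debugfs is mounted and the GPU is DRM card 0.
PSTATE_PATH = Path("/sys/kernel/debug/dri/0/pstate")


def plan_transition(current: str, target: str) -> List[str]:
    """Return the profiles to set, in order.

    A direct 648 MHz -> 6008 MHz memory switch is the case that glitches,
    so the 0a (1620 MHz) profile is inserted as an intermediate step.
    """
    if MEMORY_MHZ[current] == 648 and MEMORY_MHZ[target] == 6008:
        return ["0a", target]
    return [target]


def apply_transition(current: str, target: str, path: Path = PSTATE_PATH) -> None:
    """Write each planned profile to the pstate file (needs root on real hardware)."""
    for pstate in plan_transition(current, target):
        path.write_text(pstate + "\n")
```

For example, `apply_transition("07", "0f")` would write `0a` and then `0f`, while `apply_transition("0a", "0f")` would write only `0f`.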
If memory re-clocking is disabled, everything works fine. However, if the 0xf profile is set directly with memory re-clocking enabled (which breaks the benchmark), and nouveau is then unloaded and loaded again with memory re-clocking disabled, changing re-clocking profiles still causes glitches. This implies that whatever breaks the benchmark is only touched when memory re-clocking is enabled.
I have gone through all of the nouveau PMU script traces and checked every difference between those scripts and the ones from the Nvidia driver. None of the values that differ from Nvidia's seemed to affect this problem. The code that changes the values for the 0xf profile to match Nvidia is here: https://github.com/mmenzyns/nouveau/tree/linux-5.2_gk106_memory_issues. With it, the scripts for the highest profile should be almost identical between Nvidia and Nouveau.