
WIP: turnip: Add debug output and scripts to find suboptimal autotune decisions

To improve the autotune heuristics, we need to gather statistics on how renderpasses perform in sysmem and gmem modes, along with the high-level parameters of those renderpasses.

This will make it possible to find renderpasses where autotune makes the wrong decision, and therefore to improve it.

To achieve this we:

  • Add more information about the renderpass to tracepoints.
  • Add a debug option to dump everything autotune knows about renderpasses, including the decision autotune would make if it saw the same renderpass again.
  • Connect the u_trace information about renderpasses with the autotune information via shared ids.
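As a rough illustration of the last point, the sketch below pairs trace events with autotune entries that carry the same id. The record shapes (`TraceRecord`, `AutotuneRecord`) and their field names are hypothetical and do not reflect the actual u_trace or autotune dump formats.

```python
from dataclasses import dataclass


@dataclass
class TraceRecord:
    # Hypothetical shape of a u_trace renderpass event.
    rp_id: int
    duration_ns: int


@dataclass
class AutotuneRecord:
    # Hypothetical shape of one entry in the autotune debug dump.
    rp_id: int
    would_use_gmem: bool


def join_by_id(traces, autotune):
    """Pair each trace event with the autotune entry sharing its id."""
    decisions = {rec.rp_id: rec for rec in autotune}
    joined = []
    for event in traces:
        rec = decisions.get(event.rp_id)
        if rec is not None:  # skip events autotune has no entry for
            joined.append((event, rec))
    return joined
```

Events without a matching autotune entry are dropped here; a real script might prefer to report them instead.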

The scripts aggregate data from several sysmem and gmem runs, making it possible to compare renderpass performance between the two modes.
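The aggregation could look roughly like the sketch below. It is not the code from this MR: the input layout (one dict per replay, mapping renderpass id to duration in nanoseconds), the use of the median, and the names `collect` and `compare_modes` are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def collect(runs):
    """Gather per-renderpass durations across replays: id -> [ns, ...]."""
    samples = defaultdict(list)
    for run in runs:
        for rp_id, duration_ns in run.items():
            samples[rp_id].append(duration_ns)
    return samples


def compare_modes(sysmem_runs, gmem_runs, autotune_choice):
    """Compare median durations per renderpass and flag autotune mismatches.

    autotune_choice maps renderpass id -> "sysmem" or "gmem" (hypothetical).
    """
    sysmem = collect(sysmem_runs)
    gmem = collect(gmem_runs)
    rows = []
    # Only renderpasses seen in both modes can be compared.
    for rp_id in sorted(sysmem.keys() & gmem.keys()):
        sysmem_ns = median(sysmem[rp_id])
        gmem_ns = median(gmem[rp_id])
        # Ties are counted as sysmem; this is an arbitrary choice.
        faster = "sysmem" if sysmem_ns <= gmem_ns else "gmem"
        chosen = autotune_choice.get(rp_id)
        rows.append({
            "id": rp_id,
            "sysmem_ns": sysmem_ns,
            "gmem_ns": gmem_ns,
            "faster": faster,
            "autotune": chosen,
            "mismatch": chosen is not None and chosen != faster,
        })
    return rows
```

The median is used in this sketch because it is less sensitive than the mean to a single slow replay; the actual scripts may aggregate differently.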

Steps to gather logs for the script to process:

  1. Pin the GPU frequency
     export GPU_PARAMS=/sys/devices/platform/soc@0/3d00000.gpu/devfreq/3d00000.gpu/
     cat $GPU_PARAMS/min_freq
     cat $GPU_PARAMS/max_freq > $GPU_PARAMS/min_freq
  2. Disable GPU autosuspend
     echo 5000 > /sys/devices/platform/soc@0/3d00000.gpu/power/autosuspend_delay_ms
  3. Get a RenderDoc capture or a gfxreconstruct trace of a single frame and replay it several times in both sysmem and gmem modes
     for i in {1..10}; do TU_DEBUG=profile_autotune,sysmem GPU_TRACE=1 renderdoccmd replay capture.rdc --loops 1 &>> capture.sysmem.log; done
     for i in {1..10}; do TU_DEBUG=profile_autotune,gmem GPU_TRACE=1 renderdoccmd replay capture.rdc --loops 1 &>> capture.gmem.log; done
  4. Run generate_sysmem_gmem_comparison.py /folder/with/logs/ > result.csv

Example of the resulting CSV: test.csv

CC: @anholt
